On the Complexity of Nondeterministically
Testable Hypergraph Parameters 
Abstract

The paper proves the equivalence of the notions of nondeterministic and
deterministic parameter testing for uniform dense hypergraphs of arbitrary
order. This generalizes a result previously known only for the case of simple
graphs. By a similar method we also establish the equivalence between
nondeterministic and deterministic hypergraph property testing, answering the
open problem in the area. We introduce a new notion of a cut norm for
hypergraphs of higher order, and employ regularity techniques combined with the
ultralimit method.


1 Introduction 

Hypergraph parameters are real-valued functions defined on the space of uniform hypergraphs
of some given order that are invariant under relabeling the vertex set. Testing a parameter
value associated to an instance in the dense model means producing an estimate by only
having access to a small portion of the data that describes it. The test data is selected by
choosing a uniform random subset of the vertex set and exposing the induced substructure
of the hypergraph on this subset. A parameter is said to be testable if for every
given tolerated error the estimate is within the error range of the parameter value with
high probability, and the size of the selected random subset depends only on the size
of this permitted error and not on the size of the instance; precise definitions are provided
below. Similar notions apply to testing graph properties: in that situation one also uses
uniform sampling in order to separate the cases where an instance has the property or is
far from having it, where the distance is measured by the number of edge modifications
required. For the related notions of approximation theory and limits see [1], [3], and [4].
The general reader is referred to [9], [11], and [13] for some related developments.
The notion of nondeterministic testability was introduced by Lovász and Vesztergombi
[14] in the framework of graph property testing, and encompasses an a priori weaker
characteristic than the original testability. They defined a property to be nondeterministically
testable if there exists another property of colored (edge or node) graphs
that is testable in the normal sense and serves as a certificate for the original. It was shown
by the authors of [14] that for graph properties the two notions are equivalent,
that is, if a property is nondeterministically testable, then it is also testable. Their
proof used the machinery of graph limits and for this reason it was of non-effective nature.
Subsequently, an explicit construction of a tester was given by Gishboliner and Shapira
[8] for nondeterministically testable graph properties, containing the tester of the colored
witness property as a subroutine. They used Szemerédi's Regularity Lemma combined
with developments by Alon et al. [2], and provided a tower-type dependence between
the sample complexity of the investigated property and the sample complexity of the witness
property.

In [14], the study of nondeterministic testing for parameters was additionally initiated;
the definition is similar to the property testing situation. A different approach by Karpinski
and Markó [10] relying on weaker regularity methods led to an effective upper bound on
the sample complexity that is a 3-fold iteration of the exponential function applied to the
sample size required by the witness parameter.

The previous works mentioned above dealt with graphs; it was asked in [14] whether the
concept can be employed for hypergraphs. The notion of an r-uniform hypergraph (in
short, r-graph) parameter and its testability can be defined completely analogously to the
graph case, and the same applies to nondeterministic testability. Naturally, the first question
that arises is whether or not deterministic and nondeterministic testability are equivalent
for higher order hypergraphs, and secondly, if the answer to the first question is positive,
what can be said about the relationship between the sample complexity of the parameter
and that of its witness parameter. The statements that are analogous to the main results
of [8], [10], and [14] do not follow immediately for uniform hypergraphs of higher order
from the proof for graphs; as with the generalizations of the Regularity Lemma, new
tools and notions are required to handle these cases. In the current paper we prove the
equivalence of the two testability notions for uniform hypergraphs of higher order and
settle the first question posed above. Unfortunately, we were not able to obtain an explicit
upper bound for the sample complexity; this is a consequence of our application of the
limit theory for hypergraphs developed by Elek and Szegedy [6] using methods of nonstandard
analysis, and therefore the second problem remains open. We also show that
testing nondeterministically testable properties is as hard as parameter testing with our
method in the sense that the same complexity bounds apply.

The paper is organized as follows. In Section 2 we give the preliminaries required 
and formulate the precise definitions followed by our main result, Theorem 2.3. Section 3
contains the testability results for r-cut norms together with a brief summary of the notions 
and results regarding the ultralimit method that are needed for our purposes. Section 4 
comprises some auxiliary results required. Section 5 describes the proof of our main result. 


In Section 6 we give an application for property testing of hypergraphs, and in Section 7 
we pose some questions related to possible further research. 


2 Preliminaries and main result 

A simple r-uniform hypergraph on n vertices is a subset G of $\binom{[n]}{r}$; the size of G is n, and
the elements of $\binom{[n]}{r}$ are r-edges. Let $A_G$ denote the symmetric {0,1}-valued r-array or
symmetric subset of $[n]^r$ that represents G; we will sometimes also use G itself to
refer to a symmetric subset of $[n]^r \setminus \mathrm{diag}([n]^r)$ corresponding to the array representation.
Let k be a positive integer, and let $\mathcal{G}^{r,k}_n$ denote the set of k-colored r-uniform hypergraphs
of size n, that is, partitions $G = (G^\alpha)_{\alpha\in[k]}$ of $\binom{[n]}{r}$ into k classes, so in all that follows
a colored r-graph means a complete r-graph where to each edge e we assign exactly one
color G(e) from the set [k]. In this sense simple r-graphs are regarded as 2-colored. In the
k-colored case it is also possible to speak about the array representation: $A_{G^\alpha}$ stands for the
symmetric {0,1}-valued r-array that represents the color class of $\alpha$, and again with slight abuse
of notation we will use $G^\alpha$ for $A_{G^\alpha}$. Additionally we have to introduce the color $\iota$ and the
corresponding array $A_\iota$, which is always the indicator array of the set of diagonal elements
of $[n]^r$ (those having repetitions in their coordinates, denoted by $\mathrm{diag}([n]^r)$). For any finite
set C the term C-colored graph is defined analogously.

A k-coloring of a t-colored r-graph $G = (G^\alpha)_{\alpha\in[t]}$ is a tk-colored r-graph $\mathbf{G} = (G^{(\alpha,\beta)})_{\alpha\in[t],\beta\in[k]}$
with colors from the set [t] x [k], where each of the original color classes indexed by $\alpha \in [t]$
is retrieved by taking the union of the new classes corresponding to $(\alpha,\beta)$ over all $\beta \in [k]$,
that is $G^\alpha = \bigcup_{\beta\in[k]} G^{(\alpha,\beta)}$. This last operation is called the k-discoloring of a [t] x [k]-colored graph;
we denote it by $[\mathbf{G}, k] = G$. We will sometimes write tk-colored for [t] x [k]-colored graphs
when it is clear from the context what we mean.
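
The coloring and discoloring operations can be illustrated on a toy instance. The following is only a minimal sketch under assumptions that are not part of the paper: colors are 0-indexed, a colored r-graph is stored as a dictionary from sorted vertex tuples to colors, and the helper names `discolor` and `is_k_coloring` are hypothetical.

```python
from itertools import combinations


def discolor(colored, k):
    """k-discoloring: forget the second component of every colour (alpha, beta)."""
    out = {}
    for edge, (alpha, beta) in colored.items():
        if not 0 <= beta < k:
            raise ValueError("second colour component must lie in range(k)")
        out[edge] = alpha
    return out


def is_k_coloring(colored, base, k):
    """True if `colored` is a k-coloring of `base`, i.e. discolors to it."""
    return discolor(colored, k) == base


# A simple 3-graph on 4 vertices viewed as 2-coloured: 1 = edge, 0 = non-edge.
base = {e: int(e in {(0, 1, 2), (1, 2, 3)}) for e in combinations(range(4), 3)}

# One of its 3-colourings: attach an arbitrary second component from {0, 1, 2}.
colored = {e: (alpha, sum(e) % 3) for e, alpha in base.items()}
```

Each original color class is recovered as the union over the second component, which is exactly what `discolor` computes edge by edge.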

Further, for a finite set S, let h(S) denote the set of nonempty subsets of S, and h(S, m)
the set of nonempty subsets of S of cardinality at most m. A real $(2^r - 1)$-dimensional vector
$x_{h(S)}$ denotes $(x_{T_1}, \dots, x_{T_{2^r-1}})$, where $T_1, \dots, T_{2^r-1}$ is a fixed ordering of the nonempty subsets
of S with $T_{2^r-1} = S$; for a permutation $\pi$ of the elements of S the vector $x_{\pi(h(S))}$ means
$(x_{\pi'(T_1)}, \dots, x_{\pi'(T_{2^r-1})})$, where $\pi'$ is the action on the subsets of S induced by $\pi$.

We will require some basic notation from graph limit theory, and we summarize its
relevance as outlined in previous works; Lovász [11] is a comprehensive reference for the
area.

Let $q \ge 1$ and $G \in \mathcal{G}^{r,k}_n$, then $\mathbb{G}(q, G)$ denotes the random r-graph on q vertices that is
obtained by uniformly picking a random subset S of [n] of cardinality q and taking the
induced subgraph G[S]. For any $F \in \mathcal{G}^{r,k}_q$ and $G \in \mathcal{G}^{r,k}_n$ the F-density of G is defined as
$t(F, G) = P(F = \mathbb{G}(q, G))$.
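
The sampling step and the resulting density can be sketched for simple r-graphs as follows. This is a hedged illustration, not the paper's construction: the representation of an r-graph as a set of sorted vertex tuples, the order-preserving relabeling of the sampled vertices, and the function names are all assumptions made for the example.

```python
import random
from itertools import combinations


def sample_induced(edges, n, q, rng):
    """Induced sub-hypergraph on a uniform random q-subset of range(n).

    The chosen vertices are relabelled 0..q-1 in increasing order
    (an assumption; the text does not fix the labelling).
    """
    subset = sorted(rng.sample(range(n), q))
    relabel = {v: i for i, v in enumerate(subset)}
    chosen = set(subset)
    return frozenset(
        tuple(sorted(relabel[v] for v in e)) for e in edges if set(e) <= chosen
    )


def density_estimate(pattern, edges, n, q, trials, rng):
    """Monte Carlo estimate of the probability that the sample equals `pattern`."""
    pattern = frozenset(tuple(sorted(e)) for e in pattern)
    hits = sum(sample_induced(edges, n, q, rng) == pattern for _ in range(trials))
    return hits / trials


# Usage: in the complete 3-graph every 4-vertex sample is again complete.
complete6 = set(combinations(range(6), 3))
complete4 = set(combinations(range(4), 3))
estimate = density_estimate(complete4, complete6, 6, 4, 20, random.Random(0))
```

The estimate converges to the density as the number of trials grows; an exact value would require enumerating all q-subsets.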

Let the r-kernel space $\mathcal{W}^r_0$ denote the space of the bounded measurable functions
$W: [0,1]^{h([r],r-1)} \to \mathbb{R}$, and the subspace $\mathcal{W}^r$ of $\mathcal{W}^r_0$ the symmetric r-kernels that are invariant
under coordinate permutations induced by $\pi \in S_r$, that is $W(x_{h([r],r-1)}) = W(x_{\pi(h([r],r-1))})$
for each $\pi \in S_r$. We will refer to this invariance in the paper, both for r-kernels and for
measurable subsets of $[0,1]^{h([r])}$, as satisfying the usual symmetries. Assume that the functions
$W \in \mathcal{W}^r_I$ take their values in the interval I; for I = [0,1] we call these special symmetric
r-kernels r-graphons. In what follows, $\lambda$ always denotes the usual Lebesgue measure in
$\mathbb{R}^d$, where d is everywhere clear from the context.

Analogously to the graph case we define the space of k-colored r-graphons $\mathcal{W}^{r,k}$ whose
elements are referred to as $W = (W^\alpha)_{\alpha\in[k]}$ with each of the $W^\alpha$'s being an r-graphon. The
special color $\iota$, which in some sense stands for the absence of any colors on the diagonal, can
also be employed in this setting; see below for the case when we represent a k-colored
r-graph as a graphon. The corresponding r-graphon $W^\iota$ is {0,1}-valued. Furthermore
$\sum_{\alpha\in[k]} W^\alpha(x) = 1 - W^\iota(x)$ everywhere on $[0,1]^{h([r],r-1)}$. For $x \in [0,1]^{h([r])}$ the expression
W(x) denotes the color at x; we have $W(x) = \alpha$ whenever $\sum_{i=1}^{\alpha-1} W^i(x_{h([r],r-1)}) \le x_{[r]} <
\sum_{i=1}^{\alpha} W^i(x_{h([r],r-1)})$.

Similar to the finitary case, a k-coloring of a $W \in \mathcal{W}^{r,t}$ is a tk-colored r-graphon $\mathbf{W} =
(W^{(\alpha,\beta)})_{\alpha\in[t],\beta\in[k]}$ with colors from the set [t] x [k] so that $\sum_{\beta\in[k]} W^{(\alpha,\beta)}(x) = W^\alpha(x)$ for each
$x \in [0,1]^{h([r],r-1)}$ and $\alpha \in [t]$. The k-discoloring $[\mathbf{W}, k]$ of $\mathbf{W}$ and the term C-colored graphon
are defined analogously, and simple r-graphons are treated as 2-colored.

For $q \ge 1$ and $W \in \mathcal{W}^{r,k}$ the random $[k] \cup \{\iota\}$-colored r-graph $\mathbb{G}(q, W)$ is generated
as follows. The vertex set of $\mathbb{G}(q, W)$ is [q]; we pick uniformly a random point
$(X_S)_{S\in h([q],r-1)} \in [0,1]^{h([q],r-1)}$, then conditioned on this choice we conduct independent trials
to determine the color of each edge $e \in \binom{[q]}{r}$ with the distribution given by $P_e(\mathbb{G}(q, W)(e) =
\alpha) = W^\alpha(X_{h(e,r-1)})$ corresponding to e. Recall that $\iota$ is a special color which we want to avoid
in most cases, therefore we will highlight the conditions imposed on the above random
variables so that $\mathbb{G}(q, W) \in \mathcal{G}^{r,k}_q$.

For $F \in \mathcal{G}^{r,k}_q$ the F-density of W is defined as $t(F, W) = P(F = \mathbb{G}(q, W))$, which can be
written following the above description of the random graph as

$$t(F, W) = \int_{[0,1]^{h([q],r-1)}} \prod_{e\in\binom{[q]}{r}} W^{F(e)}(x_{h(e,r-1)}) \, d\lambda(x).$$

The above notions were introduced in order to provide a concise representation for
the limit space of r-graphs in [6] and [12]; in the current work we will not draw on this
development explicitly, but mention its relevance here. In a nutshell, a sequence of
r-graphs converges if the corresponding numerical F-density sequences converge for all 
r-graphs F. One of the main results of [12] for graphs and [6] in the general case is that 
for every convergent sequence of r-graphs there exists an r-graphon they converge to in 
the sense that the F-densities approach the F-density of the limiting r-graphon. This was 
later reproved by [15] for general r with purely combinatorial methods that are similar to 
concepts employed in the current paper. 

We can associate to each $G \in \mathcal{G}^{r,k}_n$ an element $W_G \in \mathcal{W}^{r,k}$ by subdividing the unit
cube $[0,1]^{h([r],1)}$ into $n^r$ small cubes the natural way and defining the function $W' :
[0,1]^{h([r],1)} \to [k]$ that takes the value $G(\{i_1, \dots, i_r\})$ on $[\frac{i_1-1}{n}, \frac{i_1}{n}] \times \dots \times [\frac{i_r-1}{n}, \frac{i_r}{n}]$ for distinct
$i_1, \dots, i_r$, and the value $\iota$ on the remaining diagonal cubes. Then set $(W_G)^\alpha(x_{h([r],r-1)}) =
\mathbb{I}(W'(p_{h([r],1)}(x_{h([r],r-1)})) = \alpha)$ for each $\alpha \in [k] \cup \{\iota\}$, where $p_{h([r],1)}$ is the projection to the
suitable coordinates. Note that


|f(F,G)-t(F,W G )|<-SL (2.1) 

n ~ ( q 2 ) 

for each $F \in \mathcal{G}^{r,k}_q$, hence the previous representation is compatible in the sense that
$\lim_{n\to\infty} t(F, G_n) = \lim_{n\to\infty} t(F, W_{G_n})$ for any sequence $\{G_n\}_{n=1}^\infty$ with $|V(G_n)|$ tending to
infinity.

We proceed by providing the necessary formal definitions of the parameter testability 
in the dense hypergraph model. 

Definition 2.1. An r-graph parameter f is testable if for any $\varepsilon > 0$ there exists a positive integer
$q_f(\varepsilon)$ such that for any simple r-graph G with at least $q_f(\varepsilon)$ nodes we have that

$$P(|f(G) - f(\mathbb{G}(q_f(\varepsilon), G))| > \varepsilon) < \varepsilon.$$

The smallest function $q_f$ satisfying the previous inequality is called the sample complexity of f.
The testability of parameters of k-colored r-graphs is defined analogously.
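
As a concrete instance of this definition, the edge density of an r-graph is a standard example of a testable parameter: the density of a random induced sample is an unbiased estimate of the density of the whole instance. The sketch below is illustrative only; the set-of-tuples representation and the function names are assumptions, and no sample complexity bound is claimed.

```python
import random
from itertools import combinations
from math import comb


def edge_density(edges, n, r):
    """Fraction of the r-subsets of range(n) that are edges."""
    return len(edges) / comb(n, r)


def estimate_density(edges, n, r, q, rng):
    """Edge density of the sub-hypergraph induced by a uniform random q-subset."""
    subset = set(rng.sample(range(n), q))
    induced = [e for e in edges if set(e) <= subset]
    return len(induced) / comb(q, r)


def exact_mean_estimate(edges, n, r, q):
    """Average of the sampled density over all q-subsets (tiny instances only)."""
    subsets = list(combinations(range(n), q))
    total = sum(
        sum(set(e) <= set(s) for e in edges) / comb(q, r) for s in subsets
    )
    return total / len(subsets)


# Usage on a small 3-graph with 5 vertices and 3 edges.
example_edges = {(0, 1, 2), (0, 1, 3), (2, 3, 4)}
true_value = edge_density(example_edges, 5, 3)
mean_value = exact_mean_estimate(example_edges, 5, 3, 4)
```

Here the expected value of the estimate agrees with the true density; testability additionally requires concentration around this mean for a sample size that depends only on the error parameter.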

An a priori weaker characteristic than the one above, nondeterministic testability, is 
the second cornerstone of the current work, and was introduced in [14]. 

Definition 2.2. An r-graph parameter f is non-deterministically testable if there exist an integer
k and a testable 2k-colored directed r-graph parameter g called witness such that for any simple
graph G the value $f(G) = \max_{\mathbf{G}} g(\mathbf{G})$, where the maximum goes over the set of k-colorings $\mathbf{G}$ of G
(regarded as an element of $\mathcal{G}^{r,2}$).
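
The maximization over colorings can be made explicit by brute force on very small instances. The sketch below assumes r = 2, 0-indexed colors, and a made-up witness parameter `balanced_witness` chosen purely for illustration; none of these come from the paper, and the enumeration is exponential in the number of vertex pairs.

```python
from itertools import combinations, product


def nondeterministic_value(base, k, witness):
    """Maximum of `witness` over all k-colourings of the 2-coloured graph `base`.

    `base` maps each vertex pair to 1 (edge) or 0 (non-edge); a k-colouring
    attaches a second colour component from range(k) to every pair.
    """
    pairs = sorted(base)
    best = float("-inf")
    for betas in product(range(k), repeat=len(pairs)):
        colored = {e: (base[e], b) for e, b in zip(pairs, betas)}
        best = max(best, witness(colored))
    return best


def balanced_witness(colored):
    """Hypothetical witness: the smaller of the densities of colours (1,0) and (1,1)."""
    total = len(colored)
    d0 = sum(c == (1, 0) for c in colored.values()) / total
    d1 = sum(c == (1, 1) for c in colored.values()) / total
    return min(d0, d1)


# Usage: the 4-cycle has 4 edges among 6 pairs, so the best split is 2 + 2.
cycle = {e: int(e in {(0, 1), (1, 2), (2, 3), (0, 3)})
         for e in combinations(range(4), 2)}
value = nondeterministic_value(cycle, 2, balanced_witness)
```

The resulting parameter is determined by the witness alone; the definition asks only that the witness be testable on colored graphs.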

Originally in [14], the witness parameter was a function of k-colored graphs, and the
maximum was taken over the set of (k,m)-colorings of the original graph in order to
determine the parameter value, meaning that the present edges are colored by elements
of [m], absent ones by the remaining colors from [k] \ [m]. Our modification is equivalent
to that setting and is motivated by notational purposes.

In the current paper we only deal with undirected structures, but similar results can 
be obtained when the witness parameter is defined on the space of directed r-graphs. In 
this case, in order to obtain G from $\mathbf{G}$ as above after the discoloring we additionally have
to undirect the edges and neglect multiplicities created by the former operation. 

The maximization expression in Definition 2.2 is somewhat arbitrary and could be
replaced for example by minimization; this would however not affect the testability
characteristic of the parameter. Our main result extends the equivalence of the two testability
notions to arbitrary r; this was first proved by Lovász and Vesztergombi [14] for r = 2.

Theorem 2.3. Every non-deterministically testable r-graph parameter f is testable. 
Our proof follows the proof skeleton of [10], but requires a more sophisticated
approach. The reason for this is that the norm for hypergraphs that is analogous to the cut norm
and comes with a counting lemma has some shortcomings; for instance, the sample is in
some cases far away from the original in the natural distance notion induced by the norm.
Therefore the corresponding regularity lemma cannot be applied directly as in [10].

The definitions of the relevant norms are given next.

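Definition 2.4. Let $r \ge 1$ and A be a real r-array of size n. Then the cut norm of A is

$$\|A\|_{\square,r} = \frac{1}{n^r} \max_{\substack{S_i \subset [n]^{r-1} \setminus \mathrm{diag}([n]^{r-1}) \\ i \in [r]}} |A(r; S_1, \dots, S_r)|,$$

where $A(r; S_1, \dots, S_r) = \sum_{i_1,\dots,i_r=1}^{n} A(i_1, \dots, i_r) \prod_{j=1}^{r} \mathbb{I}_{S_j}(i_1, \dots, i_{j-1}, i_{j+1}, \dots, i_r)$, and the maximum
goes over sets $S_i$ that are invariant under coordinate permutations.

If $\mathcal{P} = (P_i)_{i=1}^{t}$ is a partition of $[n]^{r-1} \setminus \mathrm{diag}([n]^{r-1})$ with symmetric classes, then the cut-$\mathcal{P}$-norm
of A is

$$\|A\|_{\square,r,\mathcal{P}} = \frac{1}{n^r} \max_{\substack{S_i \subset [n]^{r-1} \\ i \in [r]}} \sum_{j_1,\dots,j_r=1}^{t} |A(r; S_1 \cap P_{j_1}, \dots, S_r \cap P_{j_r})|.$$

The cut norm of an r-kernel W is

$$\|W\|_{\square,r} = \sup_{\substack{S_i \subset [0,1]^{h([r-1])} \\ i \in [r]}} \left| \int_{\bigcap_{i\in[r]} p^{-1}_{[r]\setminus\{i\}}(S_i)} W(x_{h([r],r-1)}) \, d\lambda(x_{h([r],r-1)}) \right|,$$

where the supremum is taken over sets $S_i$ that satisfy the usual symmetries, and $p_e$ is the natural
projection from $[0,1]^{h([r],r-1)}$ onto $[0,1]^{h(e)}$. Furthermore, for a symmetric partition $\mathcal{P} = (P_i)_{i=1}^{t}$ of
$[0,1]^{h([r-1])}$ the cut-$\mathcal{P}$-norm of an r-kernel is defined by

$$\|W\|_{\square,r,\mathcal{P}} = \sup_{\substack{S_i \subset [0,1]^{h([r-1])} \\ i \in [r]}} \sum_{j_1,\dots,j_r=1}^{t} \left| \int_{\bigcap_{i\in[r]} p^{-1}_{[r]\setminus\{i\}}(S_i \cap P_{j_i})} W(x_{h([r],r-1)}) \, d\lambda(x_{h([r],r-1)}) \right|,$$

where the supremum is taken over sets $S_i$ that satisfy the usual symmetries.
We remark that it is also true that

$$\|W\|_{\square,r} = \sup_{f_1,\dots,f_r \colon [0,1]^{h([r-1])} \to [0,1]} \left| \int_{[0,1]^{h([r],r-1)}} \prod_{i=1}^{r} f_i(x_{h([r]\setminus\{i\})}) \, W(x_{h([r],r-1)}) \, d\lambda(x_{h([r],r-1)}) \right|,$$

where the supremum goes over functions $f_i$ that satisfy the usual symmetries, and similarly
for any symmetric partition $\mathcal{P} = (P_i)_{i=1}^{t}$ of $[0,1]^{h([r-1])}$ we have with the same conditions for
the $f_i$'s as above that

$$\|W\|_{\square,r,\mathcal{P}} = \sup_{f_1,\dots,f_r \colon [0,1]^{h([r-1])} \to [0,1]} \sum_{j_1,\dots,j_r=1}^{t} \left| \int_{[0,1]^{h([r],r-1)}} \prod_{i=1}^{r} f_i(x_{h([r]\setminus\{i\})}) \, \mathbb{I}_{P_{j_i}}(x_{h([r]\setminus\{i\})}) \, W(x_{h([r],r-1)}) \, d\lambda(x_{h([r],r-1)}) \right|.$$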
In several previous works, see e.g. [1], the cut norm for r-arrays denotes a term that
is significantly different from the one in Definition 2.4 and is not suitable for our present
purposes. The above norms give rise to a distance between r-graphons, and analogously
for r-graphs.
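
For very small arrays the cut norm of Definition 2.4 can be evaluated by exhaustive search, which may help to see how the r sets interact. The sketch below is an assumption-laden reading of the definition: it takes the normalization to be the r-th power of n, represents each symmetric set as a family of unordered (r-1)-subsets, requires r to be at least 2, and stores the array as a dictionary keyed by index tuples. Its running time is exponential, so it is meant only for toy inputs.

```python
from itertools import combinations, product


def cut_norm(A, n, r):
    """Brute-force cut norm of a real r-array A on range(n)^r (r >= 2).

    A is a dict keyed by r-tuples of indices. Each set S_j is a symmetric
    subset of off-diagonal (r-1)-tuples, stored as a family of sorted tuples.
    """
    faces = list(combinations(range(n), r - 1))
    families = [
        frozenset(f for bit, f in enumerate(faces) if mask >> bit & 1)
        for mask in range(1 << len(faces))
    ]
    best = 0.0
    for sets in product(families, repeat=r):
        total = 0.0
        for idx in product(range(n), repeat=r):
            keep = True
            for j in range(r):
                rest = idx[:j] + idx[j + 1:]  # drop the j-th coordinate
                if len(set(rest)) < r - 1 or tuple(sorted(rest)) not in sets[j]:
                    keep = False
                    break
            if keep:
                total += A[idx]
        best = max(best, abs(total))
    return best / n ** r


# Usage: a 2 x 2 sign pattern whose row and column sums cancel.
signs = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}
sign_norm = cut_norm(signs, 2, 2)
```

For r = 2 this reduces to the usual cut norm of a matrix, where the two sets are subsets of the index set.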


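Definition 2.5. For two k-colored r-graphons $U = (U^\alpha)_{\alpha\in[k]}$ and $W = (W^\alpha)_{\alpha\in[k]}$ their cut distance
is defined as

$$d_{\square,r}(U, W) = \sum_{\alpha=1}^{k} \|U^\alpha - W^\alpha\|_{\square,r},$$

and their cut-$\mathcal{P}$-distance as

$$d_{\square,r,\mathcal{P}}(U, W) = \sum_{\alpha=1}^{k} \|U^\alpha - W^\alpha\|_{\square,r,\mathcal{P}}.$$

For two k-colored r-graphs $G = (G^\alpha)_{\alpha\in[k]}$ and $H = (H^\alpha)_{\alpha\in[k]}$ their corresponding distances are
defined as

$$d_{\square,r}(G, H) = d_{\square,r}(W_G, W_H),$$

and

$$d_{\square,r,\mathcal{P}}(G, H) = d_{\square,r,\mathcal{P}}(W_G, W_H).$$

Distances between an r-graph and an r-graphon, as well as for r-kernels, are defined analogously.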

Note that the norms introduced above are in general less than or equal to the 1-norm
of integrable functions, and $d_{\square,r}(U, W) \le d_{\square,r,\mathcal{P}}(U, W)$ holds for every pair. Their relevance
will be clearer in the context of the next counting lemma; we include the standard proof
only for completeness' sake.

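Lemma 2.6. Let U and W be two k-colored r-graphons with $\|U\|_\infty, \|W\|_\infty \le 1$. Then for every
$F \in \mathcal{G}^{r,k}_q$ it holds that

$$|t(F, W) - t(F, U)| \le \binom{q}{r} d_{\square,r}(U, W).$$

Proof. Fix q and $F \in \mathcal{G}^{r,k}_q$. Then

$$|t(F, W) - t(F, U)| = \left| \int_{[0,1]^{h([q],r-1)}} \prod_{e\in\binom{[q]}{r}} W^{F(e)}(x_{h(e,r-1)}) - \prod_{e\in\binom{[q]}{r}} U^{F(e)}(x_{h(e,r-1)}) \, d\lambda(x) \right|$$

$$\le \sum_{e\in\binom{[q]}{r}} \left| \int_{[0,1]^{h([q],r-1)}} \left[ W^{F(e)}(x_{h(e,r-1)}) - U^{F(e)}(x_{h(e,r-1)}) \right] \prod_{f\in\binom{[q]}{r},\, f \prec e} W^{F(f)}(x_{h(f,r-1)}) \prod_{g\in\binom{[q]}{r},\, e \prec g} U^{F(g)}(x_{h(g,r-1)}) \, d\lambda(x) \right|$$

$$\le \sum_{e\in\binom{[q]}{r}} \|W^{F(e)} - U^{F(e)}\|_{\square,r} \le \binom{q}{r} d_{\square,r}(U, W),$$

where $\prec$ is an arbitrary total ordering of the elements of $\binom{[q]}{r}$. □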

Let $d_{tw}$ denote the total variation distance between probability measures on $\mathcal{G}^{r,k^*}_q$,
where $[k]^* = [k] \cup \{\iota\}$ for $k \ge 1$ (without highlighting the specific parameters in the
notation $d_{tw}$), that is $d_{tw}(\mu, \nu) = \max_{\mathcal{F} \subset \mathcal{G}^{r,k^*}_q} |\mu(\mathcal{F}) - \nu(\mathcal{F})|$, and let the measure $\mu(q, G)$, respectively
$\mu(q, W)$, denote the probability measure of the random r-graph $\mathbb{G}(q, G)$, respectively
$\mathbb{G}(q, W)$, taking values in $\mathcal{G}^{r,k^*}_q$. It is a standard observation then that

$$d_{tw}(\mu(q, W), \mu(q, U)) = \frac{1}{2} \sum_{F} |t(F, W) - t(F, U)|, \qquad (2.2)$$

and that $\mathbb{G}(q, W)$ and $\mathbb{G}(q, U)$ can be coupled in the form of the random r-graphs $G_1$ and $G_2$,
such that


$$d_{tw}(\mu(q, W), \mu(q, U)) = P(G_1 \ne G_2), \qquad (2.3)$$

and further, for any coupling $G_1'$ and $G_2'$ it holds that $d_{tw}(\mu(q, W), \mu(q, U)) \le P(G_1' \ne G_2')$.
For $G \in \mathcal{G}^{r,k}_n$ note that


$$d_{tw}(\mu(q, G), \mu(q, W_G)) \le q^2/n, \qquad (2.4)$$

where the right hand side is a simple upper bound on the probability that if we uniformly
choose q elements of an n-element set, then we get at least two identical objects. The
inequality (2.4) follows from the fact that conditioned on the event that the independent
and uniform $X_{\{i\}}$'s for $i \in [q]$ fall in different intervals $[\frac{j-1}{n}, \frac{j}{n}]$ for $j \in [n]$ the distribution of
$\mathbb{G}(q, W_G)$ is the same as the distribution of $\mathbb{G}(q, G)$.
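
The bound on the right hand side of (2.4) is the familiar birthday-type estimate, and it can be checked numerically against the exact collision probability. The following sketch assumes independent uniform draws with replacement and at most n draws; the function name is hypothetical.

```python
from math import comb, prod


def collision_probability(q, n):
    """Exact probability that q independent uniform draws from an n-element
    set contain at least two identical elements (assumes q <= n)."""
    return 1.0 - prod(1.0 - i / n for i in range(q))


def union_bound(q, n):
    """Union bound over the pairs of draws; it is itself at most q**2 / n."""
    return comb(q, 2) / n


# Usage: compare the exact value with both bounds for a moderate sample size.
exact = collision_probability(5, 1000)
bounds = (union_bound(5, 1000), 5 ** 2 / 1000)
```

The union bound over pairs is already tighter than the quantity used in (2.4); the cruder form is simply more convenient to carry through later estimates.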

The next corollary is a direct consequence of Lemma 2.6. 

Corollary 2.7. If U and W are two k-colored r-graphons, then 

<U^,W),^,U) < ^-duA U,W), 

and there exists a coupling in form of Gi and G 2 of the random r-graphs G (q, W) and G (q, U), 
such that 

kf q r 

P(G 1 + G 2 ) < —U n#r (U,W). 

2 r! 
A generalization of the notion of a step function in the case of graphs to the situation
where we deal with r-graphs is given next. For a partition $\mathcal{P}$ the number of its classes is
denoted by $t_{\mathcal{P}}$.

Definition 2.8. We call a k-colored r-graphon W with $r \ge l$ an (r,l)-step function if there exist
positive integers $t_l, t_{l+1}, \dots, t_r = k$, a symmetric partition $\mathcal{P} = (P_1, \dots, P_{t_l})$ of $[0,1]^{h([l])}$, and real
arrays $A^\alpha_s : [t_{s-1}]^{h([s],s-1)} \to [0,1]$ with $\alpha \in [t_s]$ for $l < s \le r$ such that $\sum_{\alpha\in[t_s]} A^\alpha_s(i_{h([s],s-1)}) = 1$ for
any choice of $i_{h([s],s-1)}$ and for $s \le r$, so that $W^\alpha$ is of the following form for each $\alpha \in [k]$.


_PI 

W‘'(Xh([;-])) = ^ A“(ih([r],r-1)) 


h =1 

Sc[r],/<|S| 


Z S —1 


n Ip, s (^h(S)) ]^[ I (X ^js|(*h(SJS|-l)) ^ X S < ^A^CZh^isi-i))). 

Se( [ A Sc[r] j =1 7=1 

v ! ' l+l<\S\<r 

We refer to the partition P as the steps ofW. 

The simplest example is the (r, r - 1)-step function that can be written as

$$W^\alpha(x_{h([r])}) = \sum_{i_1,\dots,i_r=1}^{t_{r-1}} A^\alpha_r(i_1, \dots, i_r) \prod_{j=1}^{r} \mathbb{I}_{P_{i_j}}(x_{h([r]\setminus\{j\})}).$$


3 Testability of the r-cut norm 

We define a parameter of r-uniform hypergraphs that is a generalization of the ground 
state energies of [5] in the case of graphs. This notion encompasses several important 
quantities, therefore its testability is central to many applications. 

Definition 3.1. For a set $H \subset \binom{[n]}{r}$, a real r-array J of size q, and a symmetric partition $\mathcal{P} =
(P_1, \dots, P_q)$ of $\binom{[n]}{r-1}$ we define the energy

$$\mathcal{E}_{\mathcal{P},r-1}(H, J) = \frac{1}{n^r} \sum_{i_1,\dots,i_r=1}^{q} J(i_1, \dots, i_r) \, e_H(r; P_{i_1}, \dots, P_{i_r}),$$

where $e_H(r; S_1, \dots, S_r) = |\{(u_1, \dots, u_r) \in [n]^r \mid A_H(u_1, \dots, u_r) = 1$ and $A_{S_j}(u_1, \dots, u_{j-1}, u_{j+1}, \dots, u_r) =
1$ for all $j = 1, \dots, r\}|$.

Let $H = (H^\alpha)_{\alpha\in[k]}$ be a k-colored r-uniform hypergraph on the vertex set [n] and $J^\alpha$ be a real
q x ... x q r-array with $\|J^\alpha\|_\infty \le 1$ for each $\alpha \in [k]$. Then the energy for a partition $\mathcal{P}$ as above is

$$\mathcal{E}_{\mathcal{P},r-1}(H, J) = \sum_{\alpha\in[k]} \mathcal{E}_{\mathcal{P},r-1}(H^\alpha, J^\alpha).$$

The maximum of the energy over all partitions $\mathcal{P}$ of $\binom{[n]}{r-1}$ is called the ground state energy
(GSE) of H with respect to J, and is denoted by

$$\mathcal{E}_{r-1}(H, J) = \max_{\mathcal{P}} \mathcal{E}_{\mathcal{P},r-1}(H, J).$$

The GSE can also be defined for r-graphons. 
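
In the graph case r = 2 the partition lives on the vertex set itself, and the ground state energy of a small graph can be found by trying every assignment of vertices to classes. The sketch below follows that reading under several assumptions that are not taken from the paper: a single color, normalization by the square of the number of vertices, classes allowed to be empty, and hypothetical function names.

```python
from itertools import product


def energy(adj, J, assignment):
    """Energy of a vertex partition for r = 2.

    adj is a 0/1 adjacency matrix, J a q x q real matrix, and assignment[u]
    the class of vertex u. An ordered edge (u1, u2) contributes
    J[class(u2)][class(u1)], matching e_H(2; P_{i1}, P_{i2}) in the text.
    """
    n = len(adj)
    total = 0.0
    for u1 in range(n):
        for u2 in range(n):
            if adj[u1][u2]:
                total += J[assignment[u2]][assignment[u1]]
    return total / n ** 2


def ground_state_energy(adj, J):
    """Maximum energy over all assignments of the n vertices to q classes."""
    n, q = len(adj), len(J)
    return max(energy(adj, J, a) for a in product(range(q), repeat=n))


# Usage: with J rewarding edges between different classes, the GSE is a
# normalised maximum cut; the 4-cycle is bipartite, so all edges are cut.
cut_J = [[0.0, 1.0], [1.0, 0.0]]
cycle4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
gse_cycle = ground_state_energy(cycle4, cut_J)
```

Other choices of J recover other partition-type quantities, which is one reason the testability of the GSE is useful in applications.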

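Definition 3.2. For an r-graphon W, a real r-array J of size q, and a symmetric partition
$\mathcal{P} = (P_1, \dots, P_q)$ of $[0,1]^{h([r-1])}$ we define the energy

$$\mathcal{E}_{\mathcal{P},r-1}(W, J) = \sum_{i_1,\dots,i_r=1}^{q} J(i_1, \dots, i_r) \int_{\bigcap_{j\in[r]} p^{-1}_{[r]\setminus\{j\}}(P_{i_j})} W(x_{h([r],r-1)}) \, d\lambda(x_{h([r],r-1)}).$$

Let $W = (W^\alpha)_{\alpha\in[k]}$ be a k-colored r-graphon and $J^\alpha$ be a real q x ... x q r-array with $\|J^\alpha\|_\infty \le 1$
for each $\alpha \in [k]$. Then the energy for a partition $\mathcal{P}$ as above is

$$\mathcal{E}_{\mathcal{P},r-1}(W, J) = \sum_{\alpha\in[k]} \mathcal{E}_{\mathcal{P},r-1}(W^\alpha, J^\alpha),$$

and the GSE of W with respect to J is denoted by

$$\mathcal{E}_{r-1}(W, J) = \sup_{\mathcal{P}} \mathcal{E}_{\mathcal{P},r-1}(W, J),$$

where the supremum runs over all symmetric partitions $\mathcal{P} = (P_1, \dots, P_q)$ of $[0,1]^{h([r-1])}$.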

Definitions of the above energies are analogous in the directed and the weighted case,
and also for r-kernels. The next lemma tells us about the distribution of the GSE when
taking a random sample $\mathbb{G}(n, H)$ of an $H \in \mathcal{G}^{r,k}$.

Lemma 3.3. The expression $\mathcal{E}_{r-1}(\mathbb{G}(n, H), J)$ is highly concentrated around its mean, that is for
every $\varepsilon > 0$ it holds that

2 

P(|6 r _L(G(n,H), J) - E6 r _!(G(n, H),/)| > e||/|U) < 2exp(~). 

Proof. We can assume that $\|J\|_\infty \le 1$. The random r-graph $\mathbb{G}(n, H)$ is generated by picking
random nodes from V(H) without repetition; let $X_i$ denote the ith random element of
V(H) that has been selected. Define the martingale $Y_i = E[\mathcal{E}_{r-1}(\mathbb{G}(n, H), J) \mid X_1, \dots, X_i]$ for
$0 \le i \le n$. It has the property that $Y_0 = E[\mathcal{E}_{r-1}(\mathbb{G}(n, H), J)]$ and $Y_n = \mathcal{E}_{r-1}(\mathbb{G}(n, H), J)$,
whereas the jumps | Y; - Y,_, | are bounded above by f for each i 6 [n\. The last observation 
is the consequence of the fact that for any partition V of (*-”|) only at most m r ~ 1 terms in 
the sum constituting 6p /; _i(H, /) are affected by changing the placing of X,+i in the classes 
of P. Applying the Azuma-Hoeffding inequality to the martingale verifies the statement 
of the lemma. □ 
The same concentration result as above applies also to $\mathcal{E}_{r-1}(\mathbb{G}(n, W), J)$.

We will show that these hypergraph parameters are testable via the ultralimit method
and the machinery developed by Elek and Szegedy [6]. In terms of notation
and theoretical background this section stands slightly apart from the rest of the paper. First
we give a brief summary of the notions that were used in [6] in order to produce a
representation for the limit space of simple r-graphs. This representation led to a new analytical
proof method for several results for simple r-graphs such as the Regularity Lemma, the
Removal Lemma, or the testability assertion about hereditary r-graph properties.
Subsequently, technical results proved in [6] which are relevant here are mentioned; for more
details and complete proofs we refer to the source paper [6].

Recall that a sequence of r-graphs $(G_n)_{n\ge 1}$ is convergent if for every simple F the
numerical sequences $t(F, G_n)$ converge when n tends to infinity.

We start by introducing the basic notations for ultraproduct measure spaces. Let
us fix a non-principal ultrafilter $\omega$ on $\mathbb{N}$, and let $X_1, X_2, \dots$ be a sequence of finite sets of
increasing size. We define the infinite product set $\hat{X} = \prod_{i=1}^{\infty} X_i$ and the equivalence relation
$\sim$ between elements of $\hat{X}$, so that $p \sim q$ if and only if $\{i \mid p_i = q_i\} \in \omega$. Set $\mathbf{X} = \hat{X}/\sim$; this set
is called the ultraproduct of the $X_i$'s, and it will serve as the base set of the ultraproduct
probability space. Further, let $\mathcal{P}$ denote the algebra of subsets of $\mathbf{X}$ of the form $A = [\{A_i\}_{i=1}^{\infty}]$,
where $A_i \subset X_i$ for each i, and [.] denotes the equivalence class under $\sim$ (for convenience,
$p = [\{p_i\}_{i=1}^{\infty}] \in [\{A_i\}_{i=1}^{\infty}]$ exactly in the case when $\{i \mid p_i \in A_i\} \in \omega$).

Define a measure on the sets belonging to $\mathcal{P}$ through the ultralimit of the counting
measure on the sets $X_i$, that is, $\mu(A) = \lim_\omega \frac{|A_i|}{|X_i|}$, where the ultralimit of a bounded real
numerical sequence $\{x_i\}_{i=1}^{\infty}$ is denoted by $x = \lim_\omega x_i$ and is defined by the property that
for every $\varepsilon > 0$ we have $\{i \mid |x - x_i| < \varepsilon\} \in \omega$. One can see that the limit exists for every
bounded sequence and is unique, therefore well-defined; this is a consequence of basic
properties of a non-principal ultrafilter. The set $\mathcal{N} \subset 2^{\mathbf{X}}$ of $\mu$-null sets is the family of sets
N for which there exists an infinite sequence of supersets $\{A^i\}_{i=1}^{\infty} \subset \mathcal{P}$ such that $\mu(A^i) \le 1/i$.
Finally define the $\sigma$-algebra $\mathcal{S}$ on $\mathbf{X}$ as the $\sigma$-algebra generated by $\mathcal{P}$ and $\mathcal{N}$, and set the
measure $\mu(B) = \mu(A)$ for each $B \in \mathcal{S}$, where $A \triangle B \in \mathcal{N}$ and $A \in \mathcal{P}$. Again, everything is
well-defined, see [6], so we arrive at the ultraproduct measure space $(\mathbf{X}, \mathcal{S}, \mu)$.

Let $X_1, X_2, \dots$ and $Y_1, Y_2, \dots$ be two increasing sequences of finite sets with ultraproducts
$\mathbf{X}$ and $\mathbf{Y}$ respectively, then it is true that the ultraproduct of the product sequence
$X_1 \times Y_1, X_2 \times Y_2, \dots$ is the product $\mathbf{X} \times \mathbf{Y}$, but the $\sigma$-algebra $\mathcal{S}_{\mathbf{X}\times\mathbf{Y}}$ of the measure space can
be strictly larger than the $\sigma$-algebra generated by $\mathcal{S}_{\mathbf{X}} \times \mathcal{S}_{\mathbf{Y}}$, and this is a crucial point when
the aim is to construct a separable representation of the ultraproduct measure space of
product sets.

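Let r be some positive integer, and again $X_1, X_2, \dots$ a sequence of finite sets as above.
For any $e \subset [r]$ we define the ultraproduct measure spaces $(\mathbf{X}^e, \mathcal{S}_{\mathbf{X}^e}, \mu_{\mathbf{X}^e})$, also let $P_e$ denote
the natural projection from $\mathbf{X}^{[r]}$ to $\mathbf{X}^e$. Furthermore let $\sigma(e)$ denote the sub-$\sigma$-algebra of
$\mathcal{S}_{\mathbf{X}^{[r]}}$ given by $P_e^{-1}(\mathcal{S}_{\mathbf{X}^e})$, and $\sigma(e)^*$ be the sub-$\sigma$-algebra $\langle P_f^{-1}(\mathcal{S}_{\mathbf{X}^f}) \mid f \subset e, |f| < |e| \rangle$. Note that
in general $\sigma(e)$ is strictly larger than $\sigma(e)^*$. We denote the measure $\mu_{\mathbf{X}^e}$ simply by $\mu_e$ and the
$\sigma$-algebra $\mathcal{S}_{\mathbf{X}^e}$ by $\mathcal{S}_e$.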
Definition 3.4. Let r be a positive integer. We call a measure preserving map $\phi : \mathbf{X}^{[r]} \to [0,1]^{h([r])}$
a separable realization if

1. for any permutation $\pi \in S_{[r]}$ of the coordinates we have for all $x \in \mathbf{X}^{[r]}$ that $\pi^*(\phi(x)) =
\phi(\pi(x))$, where $\pi^*$ is the permutation of the power set of [r] induced by $\pi$, and

2. for any $e \in h([r])$ and any measurable $A \subset [0,1]$ we have that $\phi_e^{-1}(A) \in \sigma(e)$ and $\phi_e^{-1}(A)$ is
independent of $\sigma(e)^*$.

We are interested in the limiting behavior of sequences of k-partitions (or
edge-k-colored r-graphs on the vertex sets $X_1, X_2, \dots$) of the sequence $X_1^r, X_2^r, \dots$, where
convergence is defined in the following general way.

Let $G_i = (G_i^1, \dots, G_i^k)$ be a symmetric partition of $X_i^r$ for each $i \in \mathbb{N}$, then $(G_i)_{i=1}^{\infty}$ converges
if for every k-colored r-graph F the numerical sequences $t(F, G_i)$ converge, as in Section 2.
The ultralimit method enables us to handle the cases where the convergence does not hold
without going to subsequences; we describe the method next. Let us denote the size of F
by m and let F(e) be the color of $e \in \binom{[m]}{r}$, then $t(F, G_i)$ can be written as the measure of a
subset of $X_i^m$. We show this by explicitly presenting the set denoted by $T(F, G_i)$, so let

$$T(F, G_i) = \bigcap_{e\in\binom{[m]}{r}} P_e^{-1}(P_{s_e}(G_i^{F(e)})), \qquad (3.1)$$

where $P_e$ is the natural projection from $X_i^{[m]}$ to $X_i^e$, and $P_{s_e}$ is a bijection going from $X_i^{[r]}$ to
$X_i^e$ induced by an arbitrary but fixed bijection $s_e$ between e and [r]. We define the induced
subgraph density of the ultraproduct of k-colored r-graphs formally following (3.1): if
$G = (G^1, \dots, G^k)$ is an $\mathcal{S}_{[r]}$-measurable k-partition of $\mathbf{X}^{[r]}$ and F is as above then let

$$T(F, G) = \bigcap_{e\in\binom{[m]}{r}} P_e^{-1}(P_{s_e}(G^{F(e)})). \qquad (3.2)$$

It is easy to see that the measure of $T(F, G_i)$ is $t(F, G_i)$. Forming the ultraproduct of a series of sets
commutes with finite intersection, therefore $\lim_\omega T(F, G_i) = T(F, \lim_\omega G_i)$ and $\lim_\omega t(F, G_i) =
t(F, \lim_\omega G_i)$. Observe that all of the above notation makes perfect sense and the identities
hold true for directed colored r-graphs, that is, when the adjacency arrays of the $G^\alpha$'s are
not necessarily symmetric.

We call a measurable subset of $[0,1]^{h([r])}$ an r-set graphon if it satisfies the usual
symmetries in the coordinates induced by $S_r$ permutations; we can turn it into a proper r-graphon
in the sense of Section 2 by generating the marginal with respect to the coordinate
corresponding to [r]. Analogously a k-colored r-set graphon is a measurable partition of
$[0,1]^{h([r])}$ into k classes invariant under coordinate permutations induced by permuting
[r]. These objects can serve as representations of the ultralimits of r-graph sequences in the
sense that the numerical sequences of subgraph densities converge to densities defined for
r-set graphons in accordance with the notation in Section 2; we will provide the definition
next.
Definition 3.5. Let F be a k-colored r-graph on m vertices, and $W = (W^1, \dots, W^k)$ be a k-colored
r-set graphon. Then $T(F, W) \subset [0,1]^{h([m],r)}$ denotes the set of the symmetric maps $g : h([m], r) \to
[0,1]$ that satisfy that for each $e \in \binom{[m]}{r}$ it holds that $(g(f))_{f\in h(e)} \in W^{F(e)}$. For the Lebesgue measure
of $T(F, W)$ we write $t(F, W)$; this expression is referred to as the density of F in W.

The reader may easily verify that the above definition of density agrees with the content 
of Section 2. One of the main technical results of [6] is the following. 

Theorem 3.6. [6] Let r be an arbitrary positive integer and let $\mathcal{A}$ be a separable sub-$\sigma$-algebra of
$\mathcal{S}_{[r]}$. Then there exists a separable realization $\phi : \mathbf{X}^{[r]} \to [0,1]^{h([r])}$ such that for every $A \in \mathcal{A}$ there
exists a measurable $B \subset [0,1]^{h([r])}$ such that $\mu_{[r]}(A \triangle \phi^{-1}(B)) = 0$.

A lifting of a separable realization $\varphi : \mathbf{X}^{[r]} \to [0,1]^{h([r])}$ of degree $n$ for
$n \ge r$ is a measure preserving map $\psi : \mathbf{X}^{[n]} \to [0,1]^{h([n],r)}$ that satisfies
$p_{h([r])} \circ \psi = \varphi \circ P_{[r]}$, and is equivariant under coordinate permutations in
$S_n$, where $p_{h([r])}$ and $P_{[r]}$ are the natural projections from $[0,1]^{h([n],r)}$ to
$[0,1]^{h([r])}$, and from $\mathbf{X}^{[n]}$ to $\mathbf{X}^{[r]}$, respectively. The next lemma is
central to relating the sub-$r$-graph densities of ultraproducts to the corresponding densities in
$r$-set graphons.

Lemma 3.7. [6] For every separable realization $\varphi$ and integer $n \ge r$ there exists a degree
$n$ lifting $\psi$.

The next statement is the colored version of the homomorphism correspondence in [6] 
(Lemma 3.3. in that paper). 

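Lemma 3.8. Let $\varphi$ be a separable realization and $W = (W^1, \dots, W^k)$ be a $k$-colored
$r$-graphon, and let $H = (H^1, \dots, H^k)$ be a $k$-colored ultraproduct with
$\mu_{[r]}(H^\alpha \triangle \varphi^{-1}(W^\alpha)) = 0$ for each $\alpha \in [k]$. Let $\psi$ be a
degree $m$ lifting of $\varphi$ and $F$ be a $k$-colored $r$-graph on $m$ vertices. Then
$\mu_{[m]}(\psi^{-1}(T(F, W)) \triangle T(F, H)) = 0$, and consequently $t(F, W) = t(F, H)$ for each $F$.

Proof. By definition we have that

$$T(F, H) = \bigcap_{e \in \binom{[m]}{r}} P_e^{-1}\bigl(H^{F(e)}\bigr)$$

and

$$T(F, W) = \bigcap_{e \in \binom{[m]}{r}} p_{h(e)}^{-1}\bigl(W^{F(e)}\bigr),$$

where $p_{h(e)}$ is the natural projection onto the coordinates in $h(e)$. Due to the fact that
$\psi$ commutes with coordinate permutations from $S_m$ and the conditions we imposed on the
symmetric difference of $H^\alpha$ and $\varphi^{-1}(W^\alpha)$ the statement follows. □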

We turn to describe the relationship of two $r$-set graphons whose $F$-densities coincide
for each $F$. For this purpose we have to introduce two types of transformations and clarify
their connection. Let us define the $\sigma$-algebras $\mathcal{A}_S$, $\mathcal{A}_S^*$ and
$\mathcal{B}_S \subset \mathcal{L}_{[0,1]^{h([r])}}$ for each $S \subset [r]$: the $\sigma$-algebra
$\mathcal{B}_S$ is $p_{h(S)}^{-1}(\mathcal{L}_{[0,1]^{h(S)}})$, $\mathcal{A}_S$ is
$\langle \mathcal{B}_T \mid T \subset S \rangle$, and $\mathcal{A}_S^*$ is
$\langle \mathcal{B}_T \mid T \subset S, T \ne S \rangle$, where $\mathcal{L}_{[0,1]^t}$ denotes the
Lebesgue measurable subsets of the unit cube with the dimension given by the index.

Definition 3.9. We say that the measurable map $\varphi : [0,1]^{h([r])} \to [0,1]^{h([r])}$ is
structure preserving if it is measure preserving, for any $S \subset [r]$ we have
$\varphi^{-1}(\mathcal{A}_S) \subset \mathcal{A}_S$, for any measurable $I \subset [0,1]$ we have that
$\varphi^{-1}(p_S^{-1}(I))$ is independent of $\mathcal{A}_S^*$, and for any $\pi \in S_r$ we have
$\pi^* \circ \varphi = \varphi \circ \pi^*$, where $\pi^*$ is the coordinate permutation action
induced by $\pi$.

Let $\mathcal{L}_{h([r])}$ denote the measure algebra of
$([0,1]^{h([r])}, \mathcal{L}_{[0,1]^{h([r])}}, \lambda)$.

Definition 3.10. We call an injective homomorphism
$\Phi : \mathcal{L}_{h([r])} \to \mathcal{L}_{h([r])}$ a structure preserving embedding if it is
measure preserving, for any $S \subset [r]$ we have $\Phi(\mathcal{B}_S) \subset \mathcal{A}_S$,
$\Phi(\mathcal{B}_S)$ is independent from $\mathcal{A}_S^*$, and for any $\pi \in S_r$ we have
$\pi^* \circ \Phi = \Phi \circ \pi^*$.

Another result from [6] sheds light on the build-up of structure preserving embeddings.

Lemma 3.11. [6] Suppose that $\Phi : \mathcal{L}_{h([r])} \to \mathcal{L}_{h([r])}$ is a structure
preserving embedding of a measure algebra into itself. Then there exists a structure preserving map
$\varphi : [0,1]^{h([r])} \to [0,1]^{h([r])}$ that represents $\Phi$ in the sense that for each
$[U] \in \mathcal{L}_{h([r])}$ it holds that $\Phi([U]) = [\varphi^{-1}(U)]$, where $U$ is a
representative of $[U]$.

A random coordinate system $\tau$ is the ultraproduct function on $\mathbf{X}^{[r]}$ of the random
symmetric functions $\tau_n : [n]^r \to [0,1]^{h([r])}$ that are for each $n$ given by a uniform
random point $Z_n$ in $[0,1]^{h([n],r)}$ so that
$(\tau_n(i_1, \dots, i_r))_e = (Z_n)_{p_e(i_1, \dots, i_r)}$. An important property of the random
mapping $\tau_n$ is that for any $r$-set graphon $U$ and positive integer $n$ it holds that
$(\tau_n)^{-1}(U) = G(n, U)$, when the random sample $Z_n$ used to generate the two objects is the
same.

Lemma 3.12. [6] Let $U$ be an $r$-set graphon, and let $H = [\{G(n, U)\}_{n=1}^\infty]$. Then the
random coordinate system $\tau = [\{\tau_n\}_{n=1}^\infty]$ is a separable realization such that
with probability one we have $\mu_{[r]}(H \triangle \tau^{-1}(U)) = 0$.

A direct consequence is the statement for $k$-colored $r$-set graphons.

Corollary 3.13. Let $U = (U^1, \dots, U^k)$ be a $k$-colored $r$-set graphon, and let
$H = (H^1, \dots, H^k)$ be a $k$-colored ultraproduct in $\mathbf{X}^{[r]}$, where
$H^\alpha = [\{G(n, U^\alpha)\}_{n=1}^\infty]$ for each $\alpha \in [k]$. Then a random separable
realization $\tau$ is such that with probability one we have
$\mu_{[r]}(H^\alpha \triangle \tau^{-1}(U^\alpha)) = 0$ for each $\alpha \in [k]$.

The following result is a generalization of the uniqueness assertion of [6], and states that 
subgraph densities determine an r-set graphon up to structure preserving transformations. 

Theorem 3.14. Let $U = (U^1, \dots, U^k)$ and $V = (V^1, \dots, V^k)$ be two $k$-colored $r$-set
graphons such that for each $k$-colored $r$-graph $F$ it holds that $t(F, U) = t(F, V)$. Then there
exist two structure preserving maps $\nu_1$ and $\nu_2$ from $[0,1]^{h([r])}$ to $[0,1]^{h([r])}$ such
that $\lambda(\nu_1^{-1}(U^\alpha) \triangle \nu_2^{-1}(V^\alpha)) = 0$ for each $\alpha \in [k]$.

Proof. The equality $t(F, U) = t(F, V)$ for each $F$ implies that $G(n, U)$ and $G(n, V)$ have
the same distribution $Y_n$ for each $n$. Let $H = [\{Y_n\}_{n=1}^\infty]$; then Corollary 3.13
implies that there exist separable realizations $\varphi_1$ and $\varphi_2$ such that
$\mu_{[r]}(H^\alpha \triangle \varphi_1^{-1}(U^\alpha)) = 0$ and
$\mu_{[r]}(H^\alpha \triangle \varphi_2^{-1}(V^\alpha)) = 0$ for each $\alpha \in [k]$, therefore also
$\mu_{[r]}(\varphi_1^{-1}(U^\alpha) \triangle \varphi_2^{-1}(V^\alpha)) = 0$. Set
$\mathcal{A} = \sigma(\varphi_1^{-1}(\mathcal{L}_{[0,1]^{h([r])}}), \varphi_2^{-1}(\mathcal{L}_{[0,1]^{h([r])}}))$,
which is a separable $\sigma$-algebra on $\mathbf{X}^{[r]}$, so by Theorem 3.6 there exists a
separable realization $\varphi_3$ such that for each measurable $A \subset [0,1]^{h([r])}$ the
element $\varphi_i^{-1}(A)$ of $\mathcal{A}$ can be represented by a subset of $[0,1]^{h([r])}$
denoted by $\psi_i(A)$. It is easy to check that the maps $\psi_1$ and $\psi_2$ defined this way are
structure preserving embeddings from $\mathcal{L}_{h([r])}$ to $\mathcal{L}_{h([r])}$ satisfying
$\lambda(\psi_1(U^\alpha) \triangle \psi_2(V^\alpha)) = 0$ for each $\alpha \in [k]$. We conclude that
by Lemma 3.11 there are structure preserving $\nu_1$ and $\nu_2$ such that
$\lambda(\nu_1^{-1}(U^\alpha) \triangle \nu_2^{-1}(V^\alpha)) = 0$ for each $\alpha \in [k]$. □

The next result is perhaps also meaningful beyond the framework of this paper and 
is the main contribution in the current section. Recall the definition of the ground state 
energies (GSE), Definition 3.1 and Definition 3.2. 

Theorem 3.15. For any $J = (J^1, \dots, J^k)$ with $J^\alpha$ being a real $r$-array of size $q$ for
each $\alpha \in [k]$ the parameter of $k$-colored $r$-graphs $\mathcal{E}_{r-1}(\cdot, J)$ is testable.

Proof. We may assume that $\|J^\alpha\|_\infty \le 1$ for every $\alpha$ without losing generality.
We proceed by contradiction. Suppose there exist an $\varepsilon > 0$ and a sequence of $k$-colored
$r$-uniform hypergraphs $\{H_n\}_{n=1}^\infty$ with $V(H_n) = [m_n]$ and $m_n$ tending to infinity
such that for each $n$ with probability at least $\varepsilon$ we have that
$\mathcal{E}_{r-1}(H_n, J) + \varepsilon < \mathcal{E}_{r-1}(G(n, H_n), J)$. Let
$G_n = (G_n^1, \dots, G_n^k)$ denote the random $k$-colored hypergraph $G(n, H_n)$ for each $n$ with
$G_n^\alpha = G(n, H_n^\alpha)$. The previous event can be reformulated as stating that for each $n$
with probability at least $\varepsilon$ there is a partition $P_n = (P_n^1, \dots, P_n^q)$ of
$\binom{[n]}{r-1}$ such that the expression

$$\frac{1}{n^r} \sum_{\alpha=1}^k \sum_{i_1,\dots,i_r=1}^q J^\alpha(i_1,\dots,i_r)\, e_{G_n^\alpha}(r; P_n^{i_1}, \dots, P_n^{i_r})$$

is larger than

$$\frac{1}{m_n^r} \sum_{\alpha=1}^k \sum_{i_1,\dots,i_r=1}^q J^\alpha(i_1,\dots,i_r)\, e_{H_n^\alpha}(r; R_n^{i_1}, \dots, R_n^{i_r}) + \varepsilon$$

for any partition $R_n = (R_n^1, \dots, R_n^q)$ of $\binom{[m_n]}{r-1}$.
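
To make the quantity being compared concrete, the following Python sketch evaluates a ground state energy by exhaustive search in the simplest case $r = 2$, where the partitioned set $\binom{[n]}{r-1}$ is just the vertex set. The normalization by $n^2$, the counting of ordered pairs, and the names `ground_state_energy`, `color` and `J` are assumptions chosen for the illustration; the definitions in force are those of Definition 3.1 and Definition 3.2.

```python
from itertools import product

def ground_state_energy(color, n, J, q):
    """Brute-force maximum over all maps [n] -> [q] (the case r = 2).

    color[u][v] is the color in range(k) of the ordered pair (u, v), u != v;
    J[a][i][j] is the weight of color a between classes i and j.
    """
    best = float("-inf")
    for part in product(range(q), repeat=n):  # part[u] = class of vertex u
        energy = sum(J[color[u][v]][part[u]][part[v]]
                     for u in range(n) for v in range(n) if u != v)
        best = max(best, energy / n ** 2)
    return best

# 2-colored 4-cycle: color 1 = edge, color 0 = non-edge.
n = 4
cycle_edges = {(0, 1), (1, 2), (2, 3), (0, 3)}
color = [[1 if (min(u, v), max(u, v)) in cycle_edges else 0 for v in range(n)]
         for u in range(n)]

# Reward edges that cross between the two classes, ignore everything else.
J = [[[0, 0], [0, 0]],   # weights for color 0
     [[0, 1], [1, 0]]]   # weights for color 1
gse = ground_state_energy(color, n, J, 2)
```

With these weights the energy is the normalized maximum cut; the 4-cycle is bipartite, so all 8 ordered edge pairs can cross.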

Let $H$ denote the ultralimit of the hypergraph sequence $\{H_n\}_{n=1}^\infty$ that is a
$k$-partition in the measure space $(\mathbf{X}_1^{[r]}, \mathcal{B}_1, \mu_1)$, and let
$\sigma_1(S)$ and $\sigma_1(S)^*$ denote the sub-$\sigma$-algebras of $\mathcal{B}_1$ corresponding to
subsets $S$ of $[r]$. Due to Theorem 3.6 there exists a separable realization
$\varphi_1 : \mathbf{X}_1^{[r]} \to [0,1]^{h([r])}$ such that there is a $k$-colored $r$-set graphon
$W = (W^1, \dots, W^k)$ satisfying $\mu_1(\varphi_1^{-1}(W^\alpha) \triangle H^\alpha) = 0$ for each
$\alpha \in [k]$. Let $G(s)$ stand for the point-wise ultralimit realization of the
$\{G_n(s)\}_{n=1}^\infty \subset \mathbf{X}_2^{[r]}$ for all $s \in S$, where $(S, \mathcal{S}, \nu)$
denotes the underlying joint probability space for the random hypergraphs, and
$(\mathbf{X}_2^{[r]}, \mathcal{B}_2, \mu_2)$ is the ultraproduct measure space in the case of the
sample sequence; $\sigma_2(S)$ and $\sigma_2(S)^*$ are the corresponding sub-$\sigma$-algebras.
Note that the ultralimits $G(s)$ are not $k$-partitions of the same ultraproduct space as $H$;
moreover, it is possible that the $\sigma$-algebra generated by $\{G(s) \mid s \in S\}$ together with
$\mu_2$ forms a non-separable measure algebra, which prevents us from using Theorem 3.6 directly.

Suppose that for some $n$ we have that
$\mathbb{E}\mathcal{E}_{r-1}(G_n, J) < \mathcal{E}_{r-1}(H_n, J) + 3\varepsilon/4$. This assumption
implies that

$$P\bigl(\mathcal{E}_{r-1}(G_n, J) > \mathcal{E}_{r-1}(H_n, J) + \varepsilon\bigr) \le P\bigl(\mathcal{E}_{r-1}(G_n, J) > \mathbb{E}\mathcal{E}_{r-1}(G_n, J) + \varepsilon/4\bigr),$$

and the right-hand side is at most the exponential tail bound of Lemma 3.3. The last bound is
strictly smaller than $\varepsilon$ when $n$ is chosen sufficiently large, therefore it contradicts
the main assumption for large $n$. Therefore we can argue that
$\mathbb{E}\mathcal{E}_{r-1}(G_n, J) \ge \mathcal{E}_{r-1}(H_n, J) + 3\varepsilon/4$ for large $n$;
throwing away a starting piece of the sequence $\{H_n\}_{n=1}^\infty$ we may assume that it holds
for all $n$.

A second application of Lemma 3.3 leads to an upper bound on the probability that
$\mathcal{E}_{r-1}(G_n, J)$ is close to $\mathcal{E}_{r-1}(H_n, J)$, namely
$P(\mathcal{E}_{r-1}(G_n, J) < \mathcal{E}_{r-1}(H_n, J) + \varepsilon/2)$ is again at most the
exponential tail bound of Lemma 3.3. Hence, by invoking the Borel-Cantelli Lemma, we infer that with
probability one the event $\mathcal{E}_{r-1}(G_n, J) < \mathcal{E}_{r-1}(H_n, J) + \varepsilon/2$ can
occur only for finitely many $n$; let $M_1$ denote the (random) threshold for which it is true that
$\mathcal{E}_{r-1}(G_n, J) \ge \mathcal{E}_{r-1}(H_n, J) + \varepsilon/2$ for every $n \ge M_1$.
It follows that $\lim_\omega \mathcal{E}_{r-1}(G_n, J) \ge \lim_\omega \mathcal{E}_{r-1}(H_n, J) + \varepsilon/2$
with probability 1.

Next we will show that with probability one $G$ is equivalent to $H$ in the sense that for
each $k$-colored $r$-graph $F$ it holds that $t(F, G) = t(F, H)$. Then, since there are countably
many test graphs $F$, we can conclude that the equality holds simultaneously for all $F$ with
probability 1.

We have seen above in the paragraph after (3.2) that for every fixed $k$-colored $r$-uniform
hypergraph $F$ we have $t(F, H) = \lim_\omega t(F, H_n)$. On the other hand the subgraph densities in
random induced subgraphs are highly concentrated around their mean, that is, the probability
$P(|t(F, G_n) - t(F, H_n)| \ge \delta)$ decays exponentially in $n$ for any $\delta > 0$; this follows
with basic martingale techniques, see Theorem 11 in [6] for the almost identical statement together
with a proof. The Borel-Cantelli Lemma implies then for every fixed $F$ that with probability one for
each $\delta > 0$ there exists a (random) $n_0(\delta)$ such that for each $n \ge n_0(\delta)$ it is
true that $|t(F, G_n) - t(F, H_n)| < \delta/2$. Let us fix $\delta > 0$ and a $k$-colored $r$-graph $F$.
Since the set $\{n : |t(F, H_n) - t(F, H)| < \delta/2\}$ belongs to $\omega$ by the definition of the
ultralimit function, it holds that $\{n : |t(F, G_n) - t(F, H)| < \delta\} \in \omega$ as a
consequence of

$$\{n : |t(F, G_n) - t(F, H)| < \delta\} \supset \bigl(\{n : |t(F, G_n) - t(F, H_n)| < \delta/2\} \cap \{n : |t(F, H_n) - t(F, H)| < \delta/2\}\bigr) \in \omega.$$

Consequently, $\lim_\omega t(F, G_n) = t(F, H)$ with probability one for each $F$, and the limit
equation holds simultaneously for each $F$ also with probability one, since their number is countable.
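
The concentration of densities in random induced subgraphs can be observed numerically in the graph case. The sketch below is only an illustration under stated assumptions: the host graph (adjacency when $u + v$ is divisible by 3), the sizes, and the helper names `edge_density` and `sampled_density` are hypothetical and do not come from the text. It also checks the elementary fact that averaging the induced edge density over all $q$-subsets recovers the edge density of the host graph.

```python
import random
from itertools import combinations

def edge_density(edges, vertices):
    """Density of edges among the unordered pairs of the given vertices."""
    pairs = list(combinations(sorted(vertices), 2))
    return sum(1 for p in pairs if p in edges) / len(pairs)

def sampled_density(edges, n, q, rng):
    """Edge density of the subgraph induced by a uniform random q-subset."""
    return edge_density(edges, rng.sample(range(n), q))

n, q = 12, 6
# Hypothetical host graph: u ~ v iff u + v is divisible by 3.
edges = {(u, v) for u, v in combinations(range(n), 2) if (u + v) % 3 == 0}
true_density = edge_density(edges, range(n))

rng = random.Random(0)
samples = [sampled_density(edges, n, q, rng) for _ in range(2000)]
empirical_mean = sum(samples) / len(samples)

# Averaging over all q-subsets recovers the density exactly (up to rounding).
subsets = list(combinations(range(n), q))
exact_mean = sum(edge_density(edges, s) for s in subsets) / len(subsets)
```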

Let us pick a realization $\{G_n(s)\}_{n=1}^\infty$ of $\{G_n\}_{n=1}^\infty$ such that it satisfies
$\lim_\omega \mathcal{E}_{r-1}(G_n(s), J) - \lim_\omega \mathcal{E}_{r-1}(H_n, J) \ge \varepsilon/2$ and
$\lim_\omega t(F, G_n(s)) = t(F, H)$ for each $F$; the preceding discussion implies that such a
realization exists, in fact almost all of them are like this. Furthermore, let us consider the
sequence of partitions $P_n = (P_n^1, \dots, P_n^q)$ of $\binom{[n]}{r-1}$ that realize
$\mathcal{E}_{r-1}(G_n(s), J)$, and define $T_n^{i,j} \subset [n]^r \setminus \mathrm{diag}([n]^r)$
through the inverse images of the projections, $T_n^{i,j} = (p_j^n)^{-1}(P_n^i)$ for $i \in [q]$,
$j \in [r]$, and $n \in \mathbb{N}$, where $p_j^n$ is the projection that maps an $r$-array of size
$n$ onto an $(r-1)$-array by erasing the $j$th coordinate. Note that the $T_n^{i,j}$'s are not
completely symmetric, but are invariant under coordinate permutations from $S_{[r] \setminus \{j\}}$
for the corresponding $j \in [r]$. A further property is that $T_n^{i,j_1}$ can be obtained from
$T_n^{i,j_2}$ by swapping the coordinates corresponding to $j_1$ and $j_2$.

We additionally define the ultraproducts of these sets by
$P^i = [\{P_n^i\}_{n=1}^\infty] \subset \mathbf{X}_2^{[r-1]}$ and
$T^{i,j} = [\{T_n^{i,j}\}_{n=1}^\infty] \subset \mathbf{X}_2^{[r]}$; it is clear that
$T^{i,j} \in \sigma_2([r] \setminus \{j\})$ for each pair of $i$ and $j$, so
$\bigcap_{(i,j) \in I} T^{i,j} \in \sigma_2([r])^*$ for any $I \subset [q] \times [r]$, and that
$\mathbf{X}_2^{[r-1]} = \bigcup_i P^i$. The same symmetry assumptions apply for the $T^{i,j}$'s as
for the $T_n^{i,j}$'s described above.

We also require the fact that these ultraproduct sets defined above establish a correspondence
between the GSE of $G(s)$ and the ultralimit of the sequence of energies
$\{\mathcal{E}_{r-1}(G_n(s), J)\}_{n=1}^\infty$.

This can be seen as follows. Recall that

$$\mathcal{E}_{r-1}(G_n(s), J) = \sum_{\alpha=1}^k \sum_{i_1,\dots,i_r=1}^q J^\alpha(i_1,\dots,i_r)\, \frac{1}{n^r} \Bigl| G_n^\alpha(s) \cap \bigcap_{j=1}^r T_n^{i_j,j} \Bigr|.$$

This formula together with the identities
$[\{G_n^\alpha(s) \cap (\bigcap_{j=1}^r T_n^{i_j,j})\}_{n=1}^\infty] = G^\alpha(s) \cap (\bigcap_{j=1}^r T^{i_j,j})$,
and the fact that the ultralimit of subgraph densities equals the subgraph density of the
ultraproduct, imply that

$$\lim_\omega \mathcal{E}_{r-1}(G_n(s), J) = \sum_{\alpha=1}^k \sum_{i_1,\dots,i_r=1}^q J^\alpha(i_1,\dots,i_r)\, \mu_2\Bigl(G^\alpha(s) \cap \bigcap_{j=1}^r T^{i_j,j}\Bigr).$$

Now consider the separable sub-$\sigma$-algebra of $\mathcal{B}_2$ generated by the collection of the
sets $G^1(s), \dots, G^k(s), T^{1,1}, \dots, T^{q,r}$. Then by Theorem 3.6 there exists a separable
realization $\varphi_2 : \mathbf{X}_2^{[r]} \to [0,1]^{h([r])}$ and measurable sets
$U^1, \dots, U^k, V^{1,1}, \dots, V^{q,r}$ such that
$\mu_2(\varphi_2^{-1}(U^\alpha) \triangle G^\alpha(s)) = 0$ for each $\alpha \in [k]$ and
$\mu_2(\varphi_2^{-1}(V^{i,j}) \triangle T^{i,j}) = 0$ for every $i \in [q]$, $j \in [r]$.
Additionally, we can modify the $V^{i,j}$'s on a set of measure 0 such that each of them only depends
on the coordinates corresponding to the sets in $h([r] \setminus \{j\})$, is invariant under
coordinate permutations induced by elements of $S_{[r]}$ that fix $j$, and $V^{i,j_1}$ can be
obtained from $V^{i,j_2}$ by relabeling the coordinates according to the $S_r$ permutation swapping
$j_1$ and $j_2$. Also, $(U^1, \dots, U^k)$ form a $k$-colored $r$-set graphon $U$ when we make
modifications on null sets. Most importantly, the separable realization $\varphi_2$ is measure
preserving, so we have that

$$\lim_\omega \mathcal{E}_{r-1}(G_n(s), J) = \sum_{\alpha=1}^k \sum_{i_1,\dots,i_r=1}^q J^\alpha(i_1,\dots,i_r)\, \lambda\Bigl(U^\alpha \cap \bigcap_{j=1}^r V^{i_j,j}\Bigr). \tag{3.3}$$

On the other hand we established that $t(F, G(s)) = t(F, H)$ for each $F$, which implies
$t(F, U) = t(F, W)$; therefore the uniqueness statement of Theorem 3.14 ensures the existence of two
structure preserving measurable maps $\nu_1, \nu_2 : [0,1]^{h([r])} \to [0,1]^{h([r])}$ such that
$\lambda(\nu_2^{-1}(W^\alpha) \triangle \nu_1^{-1}(U^\alpha)) = 0$ for each $\alpha \in [k]$.

Now let us define the sets $S^{i,j} = \varphi_1^{-1}(\nu_2(\nu_1^{-1}(V^{i,j})))$; these satisfy
exactly the same symmetry properties as the $T^{i,j}$'s above, and by the measure preserving nature
of the maps involved we have that

$$\lim_\omega \mathcal{E}_{r-1}(G_n(s), J) = \sum_{\alpha=1}^k \sum_{i_1,\dots,i_r=1}^q J^\alpha(i_1,\dots,i_r)\, \mu_1\Bigl(H^\alpha \cap \bigcap_{j=1}^r S^{i_j,j}\Bigr). \tag{3.4}$$

The properties of structure preserving maps imply that $S^{i,j} \in \sigma_1([r] \setminus \{j\})$
for each $i, j$, so $\bigcap_{(i,j) \in I} S^{i,j} \in \sigma_1([r])^*$ for any
$I \subset [q] \times [r]$. Also, the ultraproduct construction makes it possible to assert the
existence of a sequence of partitions $\mathcal{R}_n = (R_n^1, \dots, R_n^q)$ of
$\binom{[m_n]}{r-1}$ for $\omega$-almost every $n$ such that
$S^{i,j} = [\{(p_j^{m_n})^{-1}(R_n^i)\}_{n=1}^\infty]$. But again by the correspondence principle
between ultralimits of sequences and ultraproducts in Lemma 3.8 applied to (3.3) and (3.4) we have

$$\lim_\omega \mathcal{E}_{\mathcal{R}_n, r-1}(H_n, J) = \lim_\omega \mathcal{E}_{r-1}(G_n(s), J).$$

Since $\mathcal{E}_{\mathcal{R}_n, r-1}(H_n, J) \le \mathcal{E}_{r-1}(H_n, J)$ for every $n$, this
contradicts $\lim_\omega \mathcal{E}_{r-1}(G_n(s), J) - \lim_\omega \mathcal{E}_{r-1}(H_n, J) \ge \varepsilon/2$.

□

An immediate consequence is that the above theorem is also true for r-graphons. 

Corollary 3.16. For any $J = (J^1, \dots, J^k)$ with $J^\alpha$ being a real $r$-array of size $l$ for
each $\alpha \in [k]$ there exists for any $\varepsilon > 0$ an integer $q(\varepsilon)$ such that for
any $k$-colored $r$-set graphon $W$ and $q \ge q(\varepsilon)$ it holds that

$$P\bigl(|\mathcal{E}_{r-1}(W, J) - \mathcal{E}_{r-1}(G(q, W), J)| > \varepsilon\bigr) < \varepsilon.$$

Proof. We only sketch the proof here; details are left to the reader. The main idea is
to find for any fixed $\varepsilon > 0$, and for each $k$-colored $r$-set graphon $W$, a $k$-colored
$r$-graph $G$ such that their GSE are sufficiently close, and further, the distributions of
$G(q_0(\varepsilon/2), W)$ and $G(q_0(\varepsilon/2), G)$ are close enough in terms of $\varepsilon$,
where $q_0$ is the sample complexity of $\mathcal{E}_{r-1}(\cdot, J)$, whose existence is ensured by
Theorem 3.15. Fix $\varepsilon > 0$, and let $W$ be a $k$-colored $r$-set graphon. By measurability
for any $\Delta > 0$ there exists an integer $l$ and a $k$-colored $r$-set graphon $U$ such that each
$U^\alpha$ is a union of cubes $\times_{S \in h([r])} [\frac{z_S - 1}{l}, \frac{z_S}{l}]$ with
$z \in [l]^{h([r])}$ and $\sum_{\alpha=1}^k \|W^\alpha - U^\alpha\|_1 \le \Delta$. For a fixed, but
sufficiently small $\Delta$, let $G$ be the $k$-colored $r$-graph on $l$ vertices that is obtained by
randomization from $U$ using the independent uniform $[0,1]$-valued random variables
$(Y_S)_{S \in h([l],r) \setminus h([l],1)}$. Then by standard large deviations results it follows
that the 1-norm of $W^\alpha - W_{G^\alpha}$ is highly concentrated around 0. By definition, the
deviation of the GSE's of two $r$-graphons can be upper bounded by a constant multiple of their
difference in the 1-norm. By Corollary 2.7 the same is true for the total variation distance of the
corresponding measures for the fixed sampling depth $q_0(\varepsilon/2)$, as the cut-norm is
dominated by the 1-norm. The quantity $\sum_{\alpha=1}^k \|W^\alpha - W_{G^\alpha}\|_1$ can be made
arbitrarily small by the above discussion, which proves the result. □

We can derive a substantial property of the cut norm from the above theorem. Recall
the definition of the relevant norms, Definition 2.4.

Lemma 3.17. Let $r \ge 1$. For any $\varepsilon > 0$ and $t \ge 1$ there exists an integer
$l_0(r, \varepsilon, t) \ge 1$ such that for any symmetric $r$-kernel $U$ that takes values in
$[-1,1]$, and for any integer $l \ge l_0(r, \varepsilon, t)$ it holds with probability at least
$1 - \varepsilon$ that

$$\Bigl| \sup_{Q,\, t_Q \le t} \|U\|_{\square,r,Q} - \sup_{Q,\, t_Q \le t} \|W_{G(l,U)}\|_{\square,r,Q} \Bigr| \le \varepsilon,$$

where the supremum at both places goes over symmetric partitions $Q$ of $[0,1]^{h([r-1])}$ into at
most $t$ classes.


Proof. Let us fix $\varepsilon > 0$, $r, t \ge 1$, and let $U$ be arbitrary. In this lemma we deal
with $r$-graphons instead of $r$-set graphons; Fubini's Theorem ensures that we can apply Theorem 3.15
correctly later on.

Showing that there exists an $l_0$ not depending on $U$ such that for each $l \ge l_0$ it holds that
$\sup_{Q, t_Q \le t} \|U\|_{\square,r,Q} - \sup_{Q, t_Q \le t} \|W_{G(l,U)}\|_{\square,r,Q} \le \varepsilon$
with failure probability at most $\varepsilon/2$ is a routine exercise; we only have to consider a
tuple $(S_i)_{i \in [r]}$ of symmetric subsets of $[0,1]^{h([r-1])}$ and a symmetric partition $Q^0$
of $[0,1]^{h([r-1])}$ into at most $t$ classes such that

$$\sup_{Q,\, t_Q \le t} \|U\|_{\square,r,Q} = \sum_{j_1,\dots,j_r=1}^{t} \Bigl| \int_{\bigcap_{i \in [r]} p_{[r] \setminus \{i\}}^{-1}(S_i \cap Q^0_{j_i})} U(x_{h([r],r-1)})\, d\lambda(x_{h([r],r-1)}) \Bigr|$$

and use Markov's inequality. The difficult part is to show that if $l$ is large enough then for
each $U$ it holds that

$$\sup_{Q,\, t_Q \le t} \|W_{G(l,U)}\|_{\square,r,Q} - \sup_{Q,\, t_Q \le t} \|U\|_{\square,r,Q} \le \varepsilon$$

with probability at least $1 - \varepsilon/2$.

First we have to discretize the range of $U$ in order to apply the above result on $k$-colored
$r$-graphons, Corollary 3.16. Therefore we split the interval $[-1,1]$ into consecutive intervals
$I_1, \dots, I_k$ of length at most $\varepsilon/4$, let $y_j = \inf I_j$ for each $j \in [k]$, and
define the $r$-kernel $W(x) = \sum_{j=1}^k \mathbb{1}_{I_j}(U(x))\, y_j$. Then
$\|U - W\|_\infty \le \varepsilon/4$, so
$|\|U\|_{\square,r,Q} - \|W\|_{\square,r,Q}| \le \varepsilon/4$ and
$|\|W_{G(l,U)}\|_{\square,r,Q} - \|W_{G(l,W)}\|_{\square,r,Q}| \le \varepsilon/4$ for any $Q$ and $l$.
Thus, it suffices to show the existence of an $l_0$ not depending on $U$ or $W$ such that for each
$l \ge l_0$ we have

$$\|W_{G(l,W)}\|_{\square,r,Q} - \|W\|_{\square,r,Q} \le \varepsilon/2$$

for each symmetric partition $Q$ of $[0,1]^{h([r-1])}$ into at most $t$ classes simultaneously with
probability at least $1 - \varepsilon/2$.
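
The discretization step can be illustrated pointwise: every value in $[-1,1]$ is replaced by the left endpoint of the grid interval of length $\varepsilon/4$ that contains it, so the sup-norm error stays below $\varepsilon/4$. The Python sketch below is a minimal illustration under the assumption of a uniform grid; the function name `discretize` and the sample values are hypothetical.

```python
import math

def discretize(value, eps):
    """Round a value in [-1, 1] down to the left endpoint of its grid interval.

    The grid intervals have length eps / 4, so the rounding error is
    nonnegative and smaller than eps / 4 (up to floating point rounding).
    """
    step = eps / 4
    index = math.floor((value + 1) / step)
    return -1 + index * step

eps = 0.1
values = [-1.0, -0.37, 0.0, 0.123456, 0.999, 1.0]
errors = [v - discretize(v, eps) for v in values]
```

The number of distinct values produced this way is at most $\lceil 8/\varepsilon \rceil + 1$, which plays the role of the number of colors.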

We can rewrite $\sup_{Q, t_Q \le t} \|W\|_{\square,r,Q}$ as an optimization problem, more precisely

$$\sup_{Q,\, t_Q \le t} \|W\|_{\square,r,Q} = \sup_{Q,\, t_Q \le t}\ \max_{A \in \mathcal{A}}\ \sup_{T_j \subset [0,1]^{h([r-1])},\, j \in [r]}\ \sum_{i_1,\dots,i_r=1}^{t} A(i_1,\dots,i_r) \tag{3.5}$$
$$\qquad \times \int_{[0,1]^{h([r],r-1)}} W(x_{h([r],r-1)}) \prod_{j=1}^{r} \mathbb{1}_{T_j \cap Q_{i_j}}(x_{h([r] \setminus \{j\})})\, d\lambda(x_{h([r],r-1)}), \tag{3.6}$$

where $\mathcal{A}$ denotes the set of all $r$-arrays of size $t$ with $\{-1,1\}$ entries, and the
sets and partitions involved are symmetric.

If we swap the order of the maximization operations on the right of the above formula
(3.5), then it can be turned into a generalized energy for each $A \in \mathcal{A}$. In more detail,
consider $W$ as a $k$-colored $r$-graphon with $W^\alpha = \mathbb{1}_{W = y_\alpha}$ for each
$\alpha \in [k]$; with slight abuse of notation we set $W = (W^\alpha)_{\alpha \in [k]}$. We also
define the $r$-array $B_0$ of size $2^r$, indexed by the power set of $[r]$, so that
$B_0(i_{S_1}, \dots, i_{S_r})$ is equal to 1 if for every $j \in [r]$ we have $j \in S_j$, and is
equal to 0 otherwise. Let $J_A^\alpha = y_\alpha (A \otimes B_0)$ be the tensor product of $A$ and
$B_0$ for each $A \in \mathcal{A}$ multiplied with the scalar $y_\alpha$ with $\alpha \in [k]$; then
$J_A^\alpha$ is an $r$-array of size $2^r t$. It follows that

$$\max_{A \in \mathcal{A}} \mathcal{E}_{r-1}(W, J_A) = \sup_{Q,\, t_Q \le t} \|W\|_{\square,r,Q}. \tag{3.7}$$

Similarly,

$$\max_{A \in \mathcal{A}} \mathcal{E}_{r-1}(W_{G(l,W)}, J_A) = \sup_{Q,\, t_Q \le t} \|W_{G(l,W)}\|_{\square,r,Q},$$

hence

$$\Bigl| \sup_{Q,\, t_Q \le t} \|W_{G(l,W)}\|_{\square,r,Q} - \sup_{Q,\, t_Q \le t} \|W\|_{\square,r,Q} \Bigr| \le \max_{A \in \mathcal{A}} \bigl| \mathcal{E}_{r-1}(W_{G(l,W)}, J_A) - \mathcal{E}_{r-1}(W, J_A) \bigr|.$$

The function $\mathcal{E}_{r-1}(\cdot, J_A)$ is testable by Corollary 3.16, say with sample
complexity $q_1(\varepsilon, r, l, k)$, so $\sup_{Q, t_Q \le t} \|\cdot\|_{\square,r,Q}$ is testable
with sample complexity $l_0(r, \varepsilon, t) = q_1(\varepsilon/|\mathcal{A}|, r, m, 2^r t)$.

□


In fact, we will require the version of Lemma 3.17 for $k$-tuples of $r$-kernels.

Lemma 3.18. Let $r, k \ge 1$. For any $\varepsilon > 0$ and $t \ge 1$ there exists an integer
$q_{cut}(r, k, \varepsilon, t) \ge 1$ such that for any $k$-tuple of $r$-kernels $U_1, \dots, U_k$
that take values from $[-1,1]$, and any integer $q \ge q_{cut}(r, k, \varepsilon, t)$ it holds with
probability at least $1 - \varepsilon$ that

$$\Bigl| \sup_{Q,\, t_Q \le t} \sum_{j=1}^{k} \|U_j\|_{\square,r,Q} - \sup_{Q,\, t_Q \le t} \sum_{j=1}^{k} \|W_{G(q,U_j)}\|_{\square,r,Q} \Bigr| \le \varepsilon,$$

where the supremum at both places goes over symmetric partitions $Q$ of $[0,1]^{h([r-1])}$ into at
most $t$ classes.


Proof. We only sketch the proof as it is almost identical to that of Lemma 3.17. Let
$r, k, t \ge 1$ and $\varepsilon > 0$ be fixed, and let $U_1, \dots, U_k$ and $q$ be arbitrary. The
lower bound on $\sup_{Q, t_Q \le t} \sum_{j=1}^k \|W_{G(q,U_j)}\|_{\square,r,Q}$ can be obtained by the
same argument as above using Markov's inequality. For the upper bound we again discretize to obtain
the $r$-kernels $W_1, \dots, W_k$ with common range $\{y_\alpha : \alpha \in [m]\}$ such that
$\|U_j - W_j\|_\infty$ is sufficiently small for each $j \in [k]$; a number $m$ of values depending
only on $k$ and $\varepsilon$ will do. We associate to each $W_j$ an $m$-colored $r$-graphon
$\hat{W}_j$ as above and set $J_A^\alpha$ to $y_\alpha (A \otimes B_0)$; then

$$\max_{A_1,\dots,A_k \in \mathcal{A}}\ \sup_{Q,\, t_Q \le t}\ \sum_{j=1}^{k} \mathcal{E}_{Q,r-1}(\hat{W}_j, J_{A_j}) = \sup_{Q,\, t_Q \le t} \sum_{j=1}^{k} \|W_j\|_{\square,r,Q}.$$

Similarly,

$$\max_{A_1,\dots,A_k \in \mathcal{A}}\ \sup_{Q,\, t_Q \le t}\ \sum_{j=1}^{k} \mathcal{E}_{Q,r-1}(W_{G(q,\hat{W}_j)}, J_{A_j}) = \sup_{Q,\, t_Q \le t} \sum_{j=1}^{k} \|W_{G(q,W_j)}\|_{\square,r,Q}.$$

The testability of $\sup_{Q, t_Q \le t} \sum_{j=1}^k \mathcal{E}_{Q,r-1}(\hat{W}_j, J_{A_j})$ follows
from Theorem 3.15 with a slight modification of the argument for any fixed tuple
$A_1, \dots, A_k \in \mathcal{A}$. As the cardinality of $\mathcal{A}$ does not depend on
$W_1, \dots, W_k$ the statement of the lemma follows. □


4 Auxiliary lemmas 

We will require the version of Szemeredi's Regularity Lemma adapted to the Hilbert space 
setting. Let us recall this variant. 

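Lemma 4.1. [13] Let $\mathcal{K}_1, \mathcal{K}_2, \dots$ be arbitrary subsets of a Hilbert space
$\mathcal{H}$. Then for every $\varepsilon > 0$ and $f \in \mathcal{H}$ there is an
$m \le \lceil 1/\varepsilon^2 \rceil$ and there are $f_i \in \mathcal{K}_i$ and
$\gamma_i \in \mathbb{R}$ ($1 \le i \le m$) such that for every $g \in \mathcal{K}_{m+1}$ we have that

$$\Bigl| \Bigl\langle g,\ f - \sum_{i=1}^{m} \gamma_i f_i \Bigr\rangle \Bigr| \le \varepsilon \|f\| \|g\|.$$

We start with the following intermediate version of the regularity lemma for edge
$k$-colored $r$-graphons; the partition obtained here satisfies stronger conditions than those
imposed by the Weak Regularity Lemma [7], and weaker ones than Szemeredi's original.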

Lemma 4.2. For every $r \ge 1$, $\varepsilon > 0$, $t \ge 1$, $k \ge 1$ and $k$-colored $r$-graphon $W$
there exists a symmetric partition $P = (P_1, \dots, P_m)$ of $[0,1]^{h([r-1])}$ into
$m \le (2t)^{(rk+1)^{4/\varepsilon^2}} = t_{reg}(r, k, \varepsilon, t)$ parts and a symmetric
$(r, r-1)$-step function $V \in \mathcal{W}^{r,k}$ with steps from $P$, such that for any partition
$Q$ of $[0,1]^{h([r-1])}$ into at most $mt$ classes we have

$$d_{\square,r,Q}(W, V) \le \varepsilon.$$

Proof. Our lemma is a special case of Lemma 4.1. We set $\mathcal{H}$ to be the space of $k$-tuples of
real measurable functions on $[0,1]^{h([r],r-1)}$ with the sum of the component-wise $L^2$-products
as the inner product; this space contains $\mathcal{W}^{r,k}$. Set $s(1) = 1$ and
$s(i+1) = s(i)(s(i)t + 1)^{rk}$ for each $i \ge 1$ and let $\mathcal{K}_i$ be the set of $k$-tuples
of indicator functions that are $(r, r-1)$-step functions with $s(i)$ symmetric steps and taking
values from the set $\{-1, 0, 1\}$. Note that the elements of the $\mathcal{K}_i$'s are not
necessarily symmetric as functions, only their steps are required to be such. Further, observe that
$s(i) \le (2t)^{(rk+1)^i}$. Now apply Lemma 4.1 with the above setup for $\varepsilon/2$ and $W$ to
obtain $U$ that satisfies all the conditions of the lemma except for symmetry, in particular

$$\sum_{\alpha=1}^{k} \|U^\alpha - W^\alpha\|_{\square,r,Q} \le \varepsilon.$$

Define $V$ with $V^\alpha(x_{h([r],r-1)}) = \frac{1}{r!} \sum_{\pi \in S_r} U^\alpha(x_{\pi(h([r],r-1))})$.
The symmetry of $W$ and the triangle inequality imply that $V$ is suitable, since

$$\|V^\alpha - W^\alpha\|_{\square,r,P} \le \frac{1}{r!} \sum_{\pi \in S_r} \|(U^\alpha)^\pi - (W^\alpha)^\pi\|_{\square,r,P} = \|U^\alpha - W^\alpha\|_{\square,r,P}$$

for any $P$ and $\alpha \in [k]$, where $U^\pi(x_{h([r],r-1)}) = U(x_{\pi(h([r],r-1))})$.

□
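
The growth of the step counts used in the proof can be checked numerically. The following Python sketch iterates the recursion $s(1) = 1$, $s(i+1) = s(i)(s(i)t + 1)^{rk}$ and compares it with the bound $(2t)^{(rk+1)^i}$ as read from the text; the function names and the small parameter values are assumptions made for the illustration.

```python
def step_counts(r, k, t, length):
    """Return [s(1), ..., s(length)] for s(1) = 1, s(i+1) = s(i) * (s(i)*t + 1)**(r*k)."""
    s = [1]
    while len(s) < length:
        s.append(s[-1] * (s[-1] * t + 1) ** (r * k))
    return s

def bound(r, k, t, i):
    """The upper bound (2t)^((rk+1)^i) on s(i)."""
    return (2 * t) ** ((r * k + 1) ** i)

r, k, t = 2, 1, 1
counts = step_counts(r, k, t, 4)
within_bound = all(counts[i - 1] <= bound(r, k, t, i) for i in range(1, 5))
```

Python integers have arbitrary precision, so the comparison is exact even though the bound is doubly exponential in $i$.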


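The next lemma is analogous to Lemma 3.2 from [10]. It describes under what metric
conditions a $k$-coloring of a $t$-colored graphon can be transferred to another one so that
the two $tk$-colored graphons are close in a certain sense. For the sake of completeness we
sketch the proof.

Lemma 4.3. Let $\varepsilon > 0$, $U$ be a $t$-colored $r$-graphon that is an $(r, r-1)$-step function
with steps $P = (P_1, \dots, P_m)$ and $V$ be a $t$-colored $r$-graphon with
$d_{\square,r,P}(U, V) \le \varepsilon$. For any $k \ge 1$ and $\hat{U}$ a $[t] \times [k]$-colored
$r$-graphon that is an $(r, r-1)$-step function with steps from $P$ such that $[\hat{U}, k] = U$
there exists a $k$-coloring of $V$ denoted by $\hat{V}$ so that

$$d_{\square,r,P}(\hat{U}, \hat{V}) \le k\varepsilon.$$

Proof. Fix $\varepsilon > 0$, and let $U = (U^\alpha)_{\alpha \in [t]}$,
$V = (V^\alpha)_{\alpha \in [t]}$ and $\hat{U} = (U^{\alpha,\beta})_{\alpha \in [t], \beta \in [k]}$ be
as in the statement of the lemma. Then $\sum_{\alpha=1}^{t} U^\alpha = 1$ and
$\sum_{\beta=1}^{k} U^{\alpha,\beta} = U^\alpha$ for each $\alpha \in [t]$. Let us define
$\hat{V} = (V^{\alpha,\beta})_{\alpha \in [t], \beta \in [k]}$, which is a $k$-coloring of $V$. Set

$$V^{\alpha,\beta} = V^\alpha \Bigl[ \mathbb{1}_{U^\alpha = 0} \frac{1}{k} + \mathbb{1}_{U^\alpha > 0} \frac{U^{\alpha,\beta}}{U^\alpha} \Bigr];$$

it is easy to see that the factor on the right of the formula is an $(r, r-1)$-step function with
steps $P = (P_1, \dots, P_m)$. We estimate the deviation of each pair $U^{\alpha,\beta}$ and
$V^{\alpha,\beta}$ from above in the $r$-cut norm; for this we fix the symmetric
$S_1, \dots, S_r \subset [0,1]^{h([r-1])}$. Then we have

$$\sum_{a_1,\dots,a_r=1}^{m} \Bigl| \int_{\bigcap_{l \in [r]} p_l^{-1}(S_l \cap P_{a_l})} U^{\alpha,\beta} - V^{\alpha,\beta} \Bigr| = \sum_{a_1,\dots,a_r=1}^{m} \Bigl| \int_{\bigcap_{l \in [r]} p_l^{-1}(S_l \cap P_{a_l})} (U^\alpha - V^\alpha) \Bigl[ \mathbb{1}_{U^\alpha = 0} \frac{1}{k} + \mathbb{1}_{U^\alpha > 0} \frac{U^{\alpha,\beta}}{U^\alpha} \Bigr] \Bigr| \le \|U^\alpha - V^\alpha\|_{\square,r,P}.$$

Taking the maximum over all symmetric measurable $r$-tuples $S_1, \dots, S_r$ and summing up
over all choices of $\alpha$ and $\beta$ delivers the upper bound we were after. □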
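
The transfer of the coloring in the proof is a pointwise rescaling, which the following Python sketch carries out on a finite set of cells. Representing step functions as lists of values per cell, and the names `transfer_coloring`, `U_colored` and `V`, are assumptions made for the illustration only.

```python
def transfer_coloring(U_colored, V, k):
    """Split each color class V[a] of V in the proportions given by U_colored.

    U_colored[a][b] and V[a] are lists of values on a common finite set of cells.
    Where the class U^a vanishes, the mass of V[a] is split evenly among k colors.
    """
    t, cells = len(V), len(V[0])
    V_colored = [[[0.0] * cells for _ in range(k)] for _ in range(t)]
    for a in range(t):
        for x in range(cells):
            U_a = sum(U_colored[a][b][x] for b in range(k))  # value of U^a on cell x
            for b in range(k):
                factor = U_colored[a][b][x] / U_a if U_a > 0 else 1.0 / k
                V_colored[a][b][x] = V[a][x] * factor
    return V_colored

# Toy data with t = 2 colors, k = 2 refinements and 3 cells.
U_colored = [
    [[0.5, 0.0, 0.2], [0.5, 0.0, 0.2]],   # U^{0,0}, U^{0,1}
    [[0.0, 1.0, 0.6], [0.0, 0.0, 0.0]],   # U^{1,0}, U^{1,1}
]
V = [[0.9, 0.1, 0.5], [0.1, 0.9, 0.5]]
V_colored = transfer_coloring(U_colored, V, 2)
```

Summing the refined classes over the second index returns $V$, so the result is indeed a $k$-coloring of $V$.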
5 Proof of the main result


The central tool in the main proof is the following lemma, which can also be of independent
interest. Informally it states that every coloring of a sampled $r$-graph can be transferred
onto the graphon from which the graph was sampled, such that another sampling
procedure with a much smaller sample size cannot distinguish the two colored objects.

Lemma 5.1. For every $r \ge 1$, proximity parameter $\delta > 0$, palette sizes $t, k \ge 1$, sampling
depth $q_0 \ge 1$ there exists an integer $q_{tv} = q_{tv}(r, \delta, q_0, t, k) \ge 1$ such that for
every $q \ge q_{tv}$ the following holds. Let $U = (U^\alpha)_{\alpha \in [t]}$ be a $t$-colored
$r$-graphon and let $V^\alpha$ denote $W_{G(q, U^\alpha)}$ for each $\alpha \in [t]$, also let
$V = (V^\alpha)_{\alpha \in [t]}$, so $W_{G(q,U)} = V$. Then with probability at least $1 - \delta$
there exists for every $k$-coloring
$\hat{V} = (V^{\alpha,\beta})_{\alpha \in [t], \beta \in [k]}$ of $V$ a $k$-coloring
$\hat{U} = (U^{\alpha,\beta})_{\alpha \in [t], \beta \in [k]}$ of $U = (U^\alpha)_{\alpha \in [t]}$
such that we have that

$$d_{tv}\bigl(\mu(q_0, \hat{V}), \mu(q_0, \hat{U})\bigr) \le \delta.$$

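Proof. We proceed by induction with respect to $r$. The statement is not difficult to verify
for $r = 1$. In this case the 1-graphons $U^\alpha$ and $V^\alpha$ can be regarded as indicator
functions of measurable subsets $B^\alpha$ and $A^\alpha$ of $[0,1]$ (so for each $\alpha \in [t]$ we
have $U^\alpha = \mathbb{1}_{B^\alpha}$ and $V^\alpha = \mathbb{1}_{A^\alpha}$) that form two
partitions associated to $U$ and $V$ respectively. Note that $(A^\alpha)_{\alpha \in [t]}$ is
obtained from $(B^\alpha)_{\alpha \in [t]}$ by the sampling process. A $k$-coloring corresponds to a
refinement of these partitions with each original class being divided into $k$ measurable parts,
that is $A^\alpha = \bigcup_{\beta \in [k]} A^{\alpha,\beta}$ and
$V^{\alpha,\beta} = \mathbb{1}_{A^{\alpha,\beta}}$. Moreover,
$|t(F, \hat{U}) - t(F, \hat{V})| = |\prod_{j=1}^{q_0} \lambda(B^{F(j)}) - \prod_{j=1}^{q_0} \lambda(A^{F(j)})|$
for any $k$-coloring $\hat{U}$ of $U$ and for any $[t] \times [k]$-colored $F$ on $q_0$ vertices. We
may define a suitable coloring by partitioning each of the sets $B^\alpha$ into parts
$(B^{\alpha,\beta})_{\beta \in [k]}$ so that the classes satisfy
$\lambda(B^{\alpha,\beta}) = \lambda(B^\alpha) \frac{\lambda(A^{\alpha,\beta})}{\lambda(A^\alpha)}$ when
$\lambda(A^\alpha) > 0$, and $\lambda(B^{\alpha,\beta}) = \lambda(B^\alpha) \frac{1}{k}$ otherwise,
for each $\beta \in [k]$. Then by setting $U^{\alpha,\beta} = \mathbb{1}_{B^{\alpha,\beta}}$ and
$\hat{U} = (U^{\alpha,\beta})_{\alpha \in [t], \beta \in [k]}$ we have that

$$d_{tv}\bigl(\mu(q_0, \hat{V}), \mu(q_0, \hat{U})\bigr) = \frac{1}{2} \sum_{F : |V(F)| = q_0} |t(F, \hat{U}) - t(F, \hat{V})| \le C \max_{\alpha \in [t]} |\lambda(A^\alpha) - \lambda(B^\alpha)|,$$

where the sum runs over all $[t] \times [k]$-colored 1-graphs $F$ on $q_0$ vertices and the constant
$C$ depends only on $q_0$, $t$ and $k$.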
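
The base case construction amounts to splitting each class $B^\alpha$ in the same proportions in which $A^\alpha$ is split. The Python sketch below computes the resulting measures and, for the special case $q_0 = 1$ where the sampled distributions are just the vectors of class measures, the total variation distance. The function names and the numerical values are assumptions chosen for the illustration.

```python
def split_proportionally(b_measure, a_parts, k):
    """Measures of the k parts of B^alpha, proportional to the parts of A^alpha."""
    a_total = sum(a_parts)
    if a_total > 0:
        return [b_measure * a / a_total for a in a_parts]
    return [b_measure / k] * k

def total_variation(p, q):
    """Half of the l1 distance between two probability vectors."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))

# t = 2 classes, k = 2 colors: measures of A^{alpha,beta} and of B^alpha.
a_parts = [[0.3, 0.2], [0.5, 0.0]]
b_measures = [0.6, 0.4]

b_parts = [split_proportionally(b, a, 2) for b, a in zip(b_measures, a_parts)]

# For q_0 = 1 the two sampled distributions are the flattened measure vectors.
p = [x for parts in a_parts for x in parts]
q = [x for parts in b_parts for x in parts]
distance = total_variation(p, q)
```

In this toy instance the distance equals $\max_\alpha |\lambda(A^\alpha) - \lambda(B^\alpha)| = 0.1$.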

The probability that for a fixed $\alpha \in [t]$ the deviation
$|\lambda(A^\alpha) - \lambda(B^\alpha)|$ exceeds a prescribed threshold is exponentially small in
$q$, and the union bound over $\alpha \in [t]$ gives an upper bound of the same type for the
probability that

$$d_{tv}\bigl(\mu(q_0, \hat{V}), \mu(q_0, \hat{U})\bigr) \le \delta$$

fails for our particular choice for the coloring $\hat{U}$ of $U$. We note that the failure
probability can be made arbitrarily small with the right choice of $q$, so in particular smaller
than $\delta$; therefore a threshold $q_{tv}(1, \delta, q_0, t, k)$ satisfying the conditions of the
lemma exists.

Now assume that we have already verified the statement of the lemma for $r - 1$ and
any choice of the other parameters of $q_{tv}$. Let us proceed to the proof of the case for
$r$-graphons, therefore let $\delta > 0$, $t, k, q_0 \ge 1$ be arbitrary and fixed, $q$ to be
determined below, and $U$, $V$, and $\hat{V}$ as in the condition of the lemma. We start by
explicitly constructing a $k$-coloring $\hat{U}$ for $U$; in the second part of the proof we verify
that the construction is suitable.

In a nutshell, we proceed as follows. We approximate $\hat{V}$ by the step function $\hat{Z}$, and
set $Z = [\hat{Z}, k]$, and also approximate $U$ by $W_1$. Let $W_2$ be the sampled version of $W_1$
generated by the same process as $V$. This way $W_2$ and $Z$ are close, hence we can color $W_2$
using the coloring $\hat{Z}$ of $Z$ to obtain $\hat{W}_2$. The coloring $\hat{W}_2$ is then
transferred onto $W_1$ using the induction hypothesis applied to the marginals of the step sets of
$W_1$ and $W_2$ to get $\hat{W}_1$ with $[\hat{W}_1, k] = W_1$. Finally we color $U$ exploiting the
proximity of $U$ and $W_1$ and the colored $\hat{W}_1$.

Our construction may fail to meet the criteria of the lemma; this can be caused at two
points in the above outline. For one, it may happen that $W_2$ does not approximate $V$
well enough, and secondly, the transfer of $\hat{W}_2$ onto $W_1$ using the induction hypothesis
with $r - 1$ may fail, as the current lemma leaves space for probabilistic error. These two events
are independent from the particular choice of $\hat{V}$ and their probability can be made
sufficiently small; we aim to show this. We proceed now to the technical details.

Let $\Delta > 0$ be a sufficiently small constant depending only on $\delta$, $k$, $t$, $q_0$ and $r$.
Set $t_2 = t_{reg}(r, tk, \Delta, 1)$ and $t_1 = t_{reg}(r, t, \Delta/2, t_2)$, and define

$$q_{tv}(r, \delta, q_0, t, k) = \max\{q_{tv}(r - 1, \delta/4, q_0, t_1, t_2),\ q_{cut}(r, t, \Delta/2, t_1 t_2)\}.$$

Let $q \ge q_{tv}(r, \delta, q_0, t, k)$ be arbitrary.

First we apply Lemma 4.2 with proximity parameter $\Delta/2$ to the $t$-colored $r$-graphon
$U$; the lemma ensures the existence of a symmetric partition
$\mathcal{P} = (P_1, \dots, P_{t_\mathcal{P}})$ of $[0,1]^{h([r-1])}$ with $t_\mathcal{P} \le t_1$ and a
$t$-colored symmetric step function $W_1 = (W_1^1, \dots, W_1^t)$ with steps in $\mathcal{P}$ that
satisfies $\sup_{Q, t_Q \le t_\mathcal{P} t_2} d_{\square,r,Q}(W_1, U) \le \Delta/2$, where the
supremum runs over all symmetric partitions $Q$ of $[0,1]^{h([r-1])}$ with at most
$t_\mathcal{P} t_2$ classes. Applying structure preserving transformations to $[0,1]^{h([r-1])}$ the
classes of $\mathcal{P}$ can be considered as piled up, meaning that for each
$y \in [0,1]^{h([r-1],r-2)}$ the fibers $\{y\} \times [0,1]$ are partitioned by the intersections
with the classes of $\mathcal{P}$ into intervals
$[0, a_1), [a_1, a_2), \dots, [a_{t_\mathcal{P}-1}, a_{t_\mathcal{P}}]$ with
$\{y\} \times [a_{j-1}, a_j) = (\{y\} \times [0,1]) \cap P_j$.
We introduce the $r$-dimensional real arrays $A_1, \dots, A_t$ in order to describe the explicit
form of the $W_1^\alpha$'s. So,

$$W_1^\alpha(x_{h([r],r-1)}) = \sum_{i_1,\dots,i_r=1}^{t_\mathcal{P}} A_\alpha(i_1, \dots, i_r) \prod_{l=1}^{r} \mathbb{1}_{P_{i_l}}(x_{h([r] \setminus \{l\})}).$$

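Let $W_2 = (W_2^1, \dots, W_2^t)$ denote $G(q, W_1)$, so $W_2^\alpha$ stands for $G(q, W_1^\alpha)$
for each $\alpha \in [t]$; then Lemma 3.18 implies that for $q \ge q_{cut}(r, t, \Delta/2, t_1 t_2)$
it holds that

$$\sup_{Q,\, t_Q \le t_\mathcal{P} t_2} d_{\square,r,Q}(W_2, V) \le \Delta,$$

with probability at least $1 - \Delta/2$, since $t_\mathcal{P} \le t_1$. Also,

$$W_2^\alpha(x_{h([r],r-1)}) = \sum_{i_1,\dots,i_r=1}^{t_\mathcal{P}} A_\alpha(i_1, \dots, i_r) \prod_{l=1}^{r} \mathbb{1}_{P'_{i_l}}(x_{h([r] \setminus \{l\})})$$

for each $\alpha \in [t]$ and

$$P'_j = \bigcup_{(p_1,\dots,p_{r-1}) \in I_j} \Bigl[\frac{p_1 - 1}{q}, \frac{p_1}{q}\Bigr] \times \dots \times \Bigl[\frac{p_{r-1} - 1}{q}, \frac{p_{r-1}}{q}\Bigr] \times [0,1] \times \dots \times [0,1]$$

with $I_j = \{(p_1, \dots, p_{r-1}) : X_{h(\{p_1, \dots, p_{r-1}\})} \in P_j\}$ for every
$j \in [t_\mathcal{P}]$. Note that $\mathcal{P}' = (P'_j)_{j \in [t_\mathcal{P}]}$ is a
symmetric partition.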

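We now apply Lemma 4.2 with proximity parameter $\Delta$ in order to approximate the
$[t] \times [k]$-colored $r$-graph $\hat{V} = (\hat{V}^{\alpha,\beta})_{\alpha \in [t], \beta \in [k]}$. The outcome is a $[t] \times [k]$-colored
step function $\hat{Z} = (\hat{Z}^{\alpha,\beta})_{\alpha \in [t], \beta \in [k]}$ with symmetric steps in a partition
$\mathcal{R} = (R_1, \dots, R_{t_{\mathcal{R}}})$ of $[0,1]^{\mathfrak{h}([r-1])}$ with $t_{\mathcal{R}} \le t_2$ that satisfies

\[ \sup_{\mathcal{Q},\, t_{\mathcal{Q}} \le t_{\mathcal{R}}} d_{\square,r,\mathcal{Q}}(\hat{V}, \hat{Z}) < \Delta. \]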


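We introduce the $t$-colored step function $Z = [\hat{Z}, k]$ that is the $k$-discoloring of $\hat{Z}$ and has
steps in $\mathcal{R}$, and note that

\[ \sup_{\mathcal{Q},\, t_{\mathcal{Q}} \le t_{\mathcal{R}}} d_{\square,r,\mathcal{Q}}(V, Z) < \Delta, \]

and therefore

\[ \sup_{\mathcal{Q},\, t_{\mathcal{Q}} \le t_{\mathcal{R}}} d_{\square,r,\mathcal{Q}}(Z, W_2) < 2\Delta. \qquad (5.1) \]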

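Define the $r$-arrays $B_1, \dots, B_t$ such that for each $\alpha \in [t]$ it holds that

\[ Z^\alpha(x_{\mathfrak{h}([r], r-1)}) = \sum_{i_1, \dots, i_r = 1}^{t_{\mathcal{R}}} B_\alpha(i_1, \dots, i_r) \prod_{l=1}^{r} \mathbb{1}_{R_{i_l}}(x_{\mathfrak{h}([r] \setminus \{l\})}), \]

and further define the $r$-arrays $(B_\alpha^\beta)_{\alpha \in [t], \beta \in [k]}$ such that

\[ \hat{Z}^{\alpha,\beta}(x_{\mathfrak{h}([r], r-1)}) = \sum_{i_1, \dots, i_r = 1}^{t_{\mathcal{R}}} B_\alpha^\beta(i_1, \dots, i_r) \prod_{l=1}^{r} \mathbb{1}_{R_{i_l}}(x_{\mathfrak{h}([r] \setminus \{l\})}) \]

for each $\alpha \in [t]$, $\beta \in [k]$. Clearly, $B_\alpha(i_1, \dots, i_r) = \sum_{\beta=1}^{k} B_\alpha^\beta(i_1, \dots, i_r)$.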

Our aim next is to find a $k$-coloring of $W_2$ so that the new $tk$-colored $r$-graphon obtained
is close to $\hat{Z}$. In order to produce the coloring we apply Lemma 4.3 relying on (5.1), hence
we obtain $\hat{W}_2$ with $[\hat{W}_2, k] = W_2$. The proximity between the two $tk$-colored $r$-graphs can
be quantified by

\[ \sup_{\mathcal{Q},\, t_{\mathcal{Q}} \le t_{\mathcal{R}}} d_{\square,r,\mathcal{Q}}(\hat{Z}, \hat{W}_2) < 2k\Delta. \]

The graphon $\hat{W}_2$ is a symmetric step function with steps that form the coarsest partition
that refines both $\mathcal{P}'$ and $\mathcal{R}$. We denote this symmetric partition of $[0,1]^{\mathfrak{h}([r-1])}$ by $\mathcal{S}$; its
number of classes satisfies $t_{\mathcal{S}} = t_{\mathcal{P}} t_{\mathcal{R}} \le t_1 t_2$.

Let us define the $t_{\mathcal{P}}$-colored $(r-1)$-graphon $w = (w^1, \dots, w^{t_{\mathcal{P}}})$ that is obtained from
the classes of the partition $\mathcal{P}$ by integrating along the coordinate corresponding to the
set $[r-1]$, that is $w^i(x_{\mathfrak{h}([r-1], r-2)}) = \int_0^1 \mathbb{1}_{P_i}(x_{\mathfrak{h}([r-1])}) \, dx_{[r-1]}$. In the same way we define the
$t_{\mathcal{P}}$-colored $(r-1)$-graphon $u = (u^1, \dots, u^{t_{\mathcal{P}}})$ corresponding to the partition $\mathcal{P}'$, as well as the
$[t_{\mathcal{P}}] \times [t_{\mathcal{R}}]$-colored $\hat{u} = (\hat{u}^{i,j})_{i \in [t_{\mathcal{P}}], j \in [t_{\mathcal{R}}]}$, where $u = [\hat{u}, t_{\mathcal{R}}]$ and $\hat{u}$ is the
$t_{\mathcal{R}}$-coloring of $u$ corresponding to the partition $\mathcal{S}$. Note that $w$, $u$, and $\hat{u}$ satisfy the usual
symmetries, since the partitions they originate from were symmetric. As the partition $\mathcal{P}'$ was
constructed via the same sampling procedure as $V$ and $W_2$, it holds that $u = \mathbb{G}(q, w)$ and
$u^i = \mathbb{G}(q, w^i)$ for each $i \in [t_{\mathcal{P}}]$.

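We can assert that due to the induction hypothesis there exists a $t_{\mathcal{R}}$-coloring
$\hat{w} = (\hat{w}^{i,j})_{i \in [t_{\mathcal{P}}], j \in [t_{\mathcal{R}}]}$ of $w$ that satisfies

\[ d_{\mathrm{tw}}(\mu(q_0, \hat{w}), \mu(q_0, \hat{u})) < \delta/4 \]

with probability at least $1 - \delta/4$ for $q \ge q_{\mathrm{tw}}(r-1, \delta/4, q_0, t_1, t_2)$.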

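We construct a $k$-coloring for $W_1$ next. Recalling the proof of Lemma 4.3, we have that

\[ \hat{W}_2^{\alpha,\beta}(x_{\mathfrak{h}([r], r-1)}) = \sum_{i_1, \dots, i_r = 1}^{t_{\mathcal{P}}} \sum_{j_1, \dots, j_r = 1}^{t_{\mathcal{R}}} A_\alpha(i_1, \dots, i_r) \Big[ \frac{B_\alpha^\beta(j_1, \dots, j_r)}{B_\alpha(j_1, \dots, j_r)} \mathbb{1}_{B_\alpha > 0} + \frac{1}{k} \mathbb{1}_{B_\alpha = 0} \Big] \prod_{m=1}^{r} \mathbb{1}_{P'_{i_m} \cap R_{j_m}}(x_{\mathfrak{h}([r] \setminus \{m\})}), \]

and set

\[ A_\alpha^\beta((i_1, j_1), \dots, (i_r, j_r)) = A_\alpha(i_1, \dots, i_r) \Big[ \frac{B_\alpha^\beta(j_1, \dots, j_r)}{B_\alpha(j_1, \dots, j_r)} \mathbb{1}_{B_\alpha > 0} + \frac{1}{k} \mathbb{1}_{B_\alpha = 0} \Big] \qquad (5.2) \]

for all $\alpha \in [t]$, $\beta \in [k]$ and $((i_1, j_1), \dots, (i_r, j_r)) \in ([t_{\mathcal{P}}] \times [t_{\mathcal{R}}])^r$, where the indicators
$\mathbb{1}_{B_\alpha > 0}$ and $\mathbb{1}_{B_\alpha = 0}$ refer to the value $B_\alpha(j_1, \dots, j_r)$.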

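We utilize the $t_{\mathcal{R}}$-coloring $\hat{w}$ of the $(r-1)$-graphon $w$ to construct a refined partition
of $\mathcal{P}$ that resembles $\mathcal{S}$ in order to enable the construction of a $k$-coloring of $W_1$ along the
same lines as in (5.2). Let

\[ P_{i,j} = \Big\{ x \in [0,1]^{\mathfrak{h}([r-1])} : \sum_{l=1}^{i-1} w^l(x_{\mathfrak{h}([r-1], r-2)}) + \sum_{l=1}^{j-1} \hat{w}^{i,l}(x_{\mathfrak{h}([r-1], r-2)}) \le x_{[r-1]} < \sum_{l=1}^{i-1} w^l(x_{\mathfrak{h}([r-1], r-2)}) + \sum_{l=1}^{j} \hat{w}^{i,l}(x_{\mathfrak{h}([r-1], r-2)}) \Big\} \]

for each $i \in [t_{\mathcal{P}}]$ and $j \in [t_{\mathcal{R}}]$. Let $\mathcal{P}'' = (P_{i,j})_{i \in [t_{\mathcal{P}}], j \in [t_{\mathcal{R}}]}$.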
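The slicing of a single fiber into the classes $P_{i,j}$ can be illustrated by a short computation. The following is only a sketch under simplifying assumptions: the function name `slice_fiber` is hypothetical, and the list `w_hat` stands for the values of the functions $\hat{w}^{i,j}$ at one fixed point of $[0,1]^{\mathfrak{h}([r-1], r-2)}$, given as exact fractions whose total is 1.

```python
from fractions import Fraction


def slice_fiber(w_hat):
    """Return the interval [lo, hi) occupied by each class P_{i,j} on one fiber.

    w_hat[i-1][j-1] plays the role of the value of w-hat^{i,j} at a fixed
    point of the lower-dimensional cube (an assumption of this sketch).
    The row sums are the values w^i, and all entries must add up to 1.
    """
    if sum(sum(row) for row in w_hat) != 1:
        raise ValueError("the masses on a fiber must sum to 1")
    intervals = {}
    lo = Fraction(0)
    for i, row in enumerate(w_hat, start=1):
        for j, mass in enumerate(row, start=1):
            # The lower end equals sum_{l<i} w^l + sum_{l<j} w-hat^{i,l}.
            hi = lo + mass
            intervals[(i, j)] = (lo, hi)
            lo = hi
    return intervals
```

For fixed $i$ the intervals of the pairs $(i, 1), \dots, (i, t_{\mathcal{R}})$ are consecutive and together cover the interval of length $w^i$ belonging to $P_i$, which is the piled-up structure used in the construction.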

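Clearly, $(P_{i,j})_{j \in [t_{\mathcal{R}}]}$ is a partition of the set $P_i$, and
$\hat{w}^{i,j}(x_{\mathfrak{h}([r-1], r-2)}) = \int_0^1 \mathbb{1}_{P_{i,j}}(x_{\mathfrak{h}([r-1])}) \, dx_{[r-1]}$. We are
now able to describe the $k$-coloring of $W_1$. Define

\[ \hat{W}_1^{\alpha,\beta}(x_{\mathfrak{h}([r], r-1)}) = \sum_{i_1, \dots, i_r = 1}^{t_{\mathcal{P}}} \sum_{j_1, \dots, j_r = 1}^{t_{\mathcal{R}}} A_\alpha^\beta((i_1, j_1), \dots, (i_r, j_r)) \prod_{m=1}^{r} \mathbb{1}_{P_{i_m, j_m}}(x_{\mathfrak{h}([r] \setminus \{m\})}). \qquad (5.3) \]

Note that $\hat{W}_1$ is a step function with a step partition that refines $\mathcal{P}$ into $\mathcal{P}''$, but the
regularity property of $W_1$ allows for

\[ d_{\square,r,\mathcal{P}''}(U, W_1) < \Delta/2. \]

Finally we employ Lemma 4.3 that produces a $k$-coloring $\hat{U}$ of $U$, so that $\hat{U}$ is $[t] \times [k]$-colored,
and

\[ d_{\square,r,\mathcal{P}''}(\hat{U}, \hat{W}_1) < \frac{k\Delta}{2}. \]

It remains to show that this coloring satisfies the requirements of the current lemma for a
large enough $q$.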

In the first step of the coloring construction we employed the $r$-graphon version of the
intermediate regularity lemma, Lemma 4.2; therefore we can assert that for each $tk$-colored
$F$ we have by means of Lemma 2.6 that

V U/T7 Cn i/tj *M / rCr 6 

2^ |f(F, V) - f(F,Z)| < fl n ,r(V,Z) < —, 

F< f 

so we can conclude that d tw (jj.(q 0r V), fJ.(qo, Z)) < J-. 

In the next step, as a consequence of Lemma 4.3 and Corollary 2.7, we have for $\hat{W}_2$ that
$d_{\mathrm{tw}}(\mu(q_0, \hat{Z}), \mu(q_0, \hat{W}_2)) < \delta/4$.

We will next elaborate on the correctness of the inductive step of the construction. Let
us consider the $tk$-colored random $r$-graph $\mathbb{G}(q_0, \hat{W}_1)$. It is generated by the independent
uniformly distributed $[0,1]$-valued random variables $\{Y_S : S \in \mathfrak{h}([q_0], r)\}$. The color of
each edge $e = \{e_1, \dots, e_r\} \in \binom{[q_0]}{r}$ is decided by first determining the unique tuple (up to
coordinate permutations) $((i_1, j_1), \dots, (i_r, j_r)) \in ([t_{\mathcal{P}}] \times [t_{\mathcal{R}}])^r$ such that
$(Y_S)_{S \in \mathfrak{h}(e \setminus \{e_l\})} \in P_{i_l, j_l}$ for each $l \in [r]$, and then checking for which pair
$\alpha \in [t]$, $\beta \in [k]$ it holds that

\[ \sum_{l=1}^{\alpha-1} A_l((i_1, j_1), \dots, (i_r, j_r)) + \sum_{l=1}^{\beta-1} A_\alpha^l((i_1, j_1), \dots, (i_r, j_r)) \le Y_e < \sum_{l=1}^{\alpha-1} A_l((i_1, j_1), \dots, (i_r, j_r)) + \sum_{l=1}^{\beta} A_\alpha^l((i_1, j_1), \dots, (i_r, j_r)), \]

where $A_l((i_1, j_1), \dots, (i_r, j_r)) = \sum_{\beta=1}^{k} A_l^\beta((i_1, j_1), \dots, (i_r, j_r)) = A_l(i_1, \dots, i_r)$ by (5.2);
then we add the color $(\alpha, \beta)$ to $e$ with corresponding index. It is convenient to view this
process as first randomly $t_{\mathcal{P}} t_{\mathcal{R}}$-coloring an $(r-1)$-uniform template hypergraph $G_1$, whose
edges are the simplices of the original edges. Here we add a color $(i, j)$ to an $(r-1)$-edge
$e'$ whenever $(Y_S)_{S \in \mathfrak{h}(e')} \in P_{i,j}$, and conditioned on $G_1$ we subsequently make independent
choices for each edge to determine its color based on the arrays $A_\alpha^\beta$ by means of the
random variables $\{Y_S : S \in \binom{[q_0]}{r}\}$ at the top level.

Let us turn to the $tk$-colored $\mathbb{G}(q_0, \hat{W}_2)$. The above description of the random process
generating this object remains conceptually valid also for this random graph: the $r$-arrays
$A_\alpha^\beta$ are identical to the case above, only the partition $\mathcal{P}''$ has to be altered to $\mathcal{S}$. Similarly
as above, we introduce the random $(r-1)$-uniform $t_{\mathcal{P}} t_{\mathcal{R}}$-colored hypergraph $G_2$ that is
generated as above adapted to $\mathbb{G}(q_0, \hat{W}_2)$. That means that the $(r-1)$-edges are colored
by indices of the classes that form the partition $\mathcal{S}$ through the process that generates
$\mathbb{G}(q_0, \hat{W}_2)$, see above. The key observation here is that conditioned on $G_1 = G_2$, one can
couple $\mathbb{G}(q_0, \hat{W}_1)$ and $\mathbb{G}(q_0, \hat{W}_2)$ so that the two random graphs coincide with conditional
probability 1. Recall that a coupling is only another name for a joint probability space for
the two random objects with the marginal distributions following $\mu(q_0, \hat{W}_1)$ and $\mu(q_0, \hat{W}_2)$
respectively. As the conditional distributions for the choices of colors for the $r$-edges
are identical provided that $G_1 = G_2$, the coupling is trivial. In order to construct a good
unconditional coupling we require another coupling, now of $G_1$ and $G_2$, so that $\mathbb{P}(G_1 \ne G_2)$
is negligibly small for our purposes. Its existence is exactly what the induction
hypothesis ensures when $q$ is large enough.
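The link between couplings and the total variation distance that is used here can be made concrete for finite distributions. The sketch below is not taken from the paper: the function names and the dictionary representation of a distribution are assumptions made for illustration. It constructs a coupling that puts as much mass as possible on the diagonal, so that the probability of the two coordinates differing equals the total variation distance of the marginals.

```python
from fractions import Fraction


def total_variation(mu, nu):
    """Total variation distance of two finite distributions given as dicts."""
    keys = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0) - nu.get(x, 0)) for x in keys) / 2


def maximal_coupling(mu, nu):
    """Joint law (dict keyed by pairs) with marginals mu and nu.

    Assumes exact weights (e.g. Fractions) that sum to 1 in both inputs.
    The mass off the diagonal equals total_variation(mu, nu).
    """
    joint, excess_mu, excess_nu = {}, {}, {}
    for x in sorted(set(mu) | set(nu)):
        p, q = mu.get(x, 0), nu.get(x, 0)
        common = min(p, q)
        if common > 0:
            joint[(x, x)] = common  # shared mass stays on the diagonal
        if p > common:
            excess_mu[x] = p - common
        if q > common:
            excess_nu[x] = q - common
    # Match the leftover masses greedily; their totals agree and their
    # supports are disjoint, so all of this mass lands off the diagonal.
    nu_items = list(excess_nu.items())
    idx = 0
    for x, rem in excess_mu.items():
        while rem > 0:
            y, avail = nu_items[idx]
            moved = min(rem, avail)
            joint[(x, y)] = joint.get((x, y), 0) + moved
            rem -= moved
            avail -= moved
            if avail == 0:
                idx += 1
            else:
                nu_items[idx] = (y, avail)
    return joint
```

In the proof the same principle is applied twice: once to the $(r-1)$-uniform templates $G_1$ and $G_2$, and once, conditionally on $G_1 = G_2$, to the colors of the $r$-edges.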

As $q \ge q_{\mathrm{tw}}(r-1, \delta/4, q_0, t_1, t_2)$, the induction hypothesis enables us to use that there exists
for any $\hat{u}$ a $\hat{w}$ so that $d_{\mathrm{tw}}(\mu(q_0, \hat{u}), \mu(q_0, \hat{w})) < \delta/4$ holds with probability at least $1 - \delta/4$ for
each $\hat{u}$ simultaneously, which in turn implies that there is a coupling of the $t_1 t_2$-colored
random $(r-1)$-graphs $G_1$ and $G_2$ so that $\mathbb{P}(G_1 \ne G_2) < \delta/2$.

It follows that there exists a coupling of $\mathbb{G}(q_0, \hat{W}_1)$ and $\mathbb{G}(q_0, \hat{W}_2)$ such that
$\mathbb{P}(\mathbb{G}(q_0, \hat{W}_1) \ne \mathbb{G}(q_0, \hat{W}_2)) < \delta/2$ due to the discussion above, which in turn implies

\[ d_{\mathrm{tw}}(\mu(q_0, \hat{W}_1), \mu(q_0, \hat{W}_2)) < \delta/4. \]

Since $\hat{W}_1$ has at most $t_{\mathcal{P}} t_2$ steps, another application of Lemma 4.3 provides the bound

\[ d_{\mathrm{tw}}(\mu(q_0, \hat{W}_1), \mu(q_0, \hat{U})) < \delta/16, \]

as $t_{\mathcal{R}} \le t_2$.

Invoking the triangle inequality and summing up the upper bounds on the respective
deviations established above we conclude that

\[ d_{\mathrm{tw}}(\mu(q_0, \hat{V}), \mu(q_0, \hat{U})) \le d_{\mathrm{tw}}(\mu(q_0, \hat{V}), \mu(q_0, \hat{Z})) + d_{\mathrm{tw}}(\mu(q_0, \hat{Z}), \mu(q_0, \hat{W}_2)) + d_{\mathrm{tw}}(\mu(q_0, \hat{W}_2), \mu(q_0, \hat{W}_1)) + d_{\mathrm{tw}}(\mu(q_0, \hat{W}_1), \mu(q_0, \hat{U})) < \delta, \]

and the overall error probability is at most $\delta/2 + \Delta/2$, which is at most $\delta$.

□ 


With Lemma 5.1 at hand we can overcome the difficulties caused by properties of the
$r$-cut norm for $r \ge 3$ in contrast to the case $r = 2$. We now turn to the proof of the main
result of the paper.


Proof of Theorem 2.3. We regard simple hypergraphs as 2-colored $r$-graphs; in the following
the term simple should be understood this way at each appearance. Let the
$2k$-colored witness parameter of the nondeterministically testable $r$-graph parameter $f$
be denoted by g, whose sample complexity is at most q g {e) for each proximity pa¬ 
rameter e > 0. Set d(r,£,q 0r k,t ) = kgk) o<? 0 ] e > q be fixed and define 

q f (e) = max{^ tw (r, e/4, q g (e/ 4), k, 3); |^(e/2); d(r, e/4, q g (e/ 4), k, 2)}. We will show that for 
every $q \ge q_f(\varepsilon)$ the condition

\[ \mathbb{P}(|f(G) - f(\mathbb{G}(q, G))| > \varepsilon) < \varepsilon \]

is satisfied for each $G$. Let $q \ge q_f(\varepsilon)$ be arbitrary but fixed and let $G$ be a fixed simple
graph on $n$ vertices.

First we show that $f(\mathbb{G}(q, G)) \ge f(G) - \varepsilon/4$ with probability at least $1 - \varepsilon/4$. For this
let us select a $k$-coloring $\hat{G}$ of $G$ such that $f(G) = g(\hat{G})$. Then the random $k$-colored graph
$\hat{F} = \mathbb{G}(q, \hat{G})$ is a $k$-coloring of $\mathbb{G}(q, G)$, therefore $f(\mathbb{G}(q, G)) \ge g(\hat{F})$. Since $q \ge q_g(\varepsilon/4)$, we
know from the testability of $g$ that $g(\hat{F}) \ge g(\hat{G}) - \varepsilon/4$ with probability at least $1 - \varepsilon/4$, which
verifies our claim.
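The step above rests on the fact that sampling and discoloring commute: the sample of a $k$-coloring of $G$ is a $k$-coloring of the sample of $G$, provided both use the same random vertex subset. The following sketch illustrates this on a toy example. The representation (a dictionary from sorted $r$-tuples to color pairs $(\alpha, \beta)$) and the function names are assumptions made for illustration and do not appear in the paper.

```python
import random
from itertools import combinations


def induced_sample(colored_edges, n, q, rng):
    """Restrict a colored r-graph on range(n) to a uniform random q-subset.

    The chosen vertices are relabeled 0..q-1 in increasing order, so the
    keys of the result are again sorted tuples.
    """
    chosen = sorted(rng.sample(range(n), q))
    relabel = {v: i for i, v in enumerate(chosen)}
    return {
        tuple(relabel[v] for v in e): c
        for e, c in colored_edges.items()
        if all(v in relabel for v in e)
    }


def discolor(colored_edges):
    """Forget the witness component beta of every color pair (alpha, beta)."""
    return {e: alpha for e, (alpha, beta) in colored_edges.items()}
```

With two generators started from the same seed, `discolor(induced_sample(G_hat, n, q, rng1))` and `induced_sample(discolor(G_hat), n, q, rng2)` agree, which mirrors the statement that $\hat{F}$ is a $k$-coloring of $\mathbb{G}(q, G)$.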

The more difficult part is to show that $f(\mathbb{G}(q, G)) \le f(G) + \varepsilon$ with failure probability at
most $\varepsilon/2$. Let us denote the random $r$-graph $\mathbb{G}(q, G)$ by $F$. We claim that with probability
at least $1 - \varepsilon/2$ there exists for any $k$-coloring $\hat{F}$ of $F$ a $k$-coloring $\hat{G}$ of $G$ such
that $|g(\hat{F}) - g(\hat{G})| \le \varepsilon$. This suffices to verify the statement of the theorem.

Our proof exploits that the difference of the $g$ values between two colored $r$-graphs $\hat{F}$
and $\hat{G}$ can be upper bounded by

\[ |g(\hat{F}) - g(\hat{G})| \le |g(\hat{F}) - g(\mathbb{G}(q_g(\varepsilon/4), \hat{F}))| + |g(\hat{G}) - g(\mathbb{G}(q_g(\varepsilon/4), \hat{G}))| \le \varepsilon/2, \]

whenever there exists a coupling of the two random $2k$-colored $r$-graphs $\mathbb{G}(q_g(\varepsilon/4), \hat{G})$ and
$\mathbb{G}(q_g(\varepsilon/4), \hat{F})$ appearing in the above formula such that they are equal with probability
larger than $\varepsilon/2$. Set $q_0 = q_g(\varepsilon/4)$. We will show that with high probability for every $\hat{F}$
there exists a $\hat{G}$ that satisfies the previous conditions.

Recall that a coupling is a probability space together with the random $r$-graphs $G_1$ and
$G_2$ defined on it such that $G_1$ has the same marginal distribution as $\mathbb{G}(q_0, \hat{G})$ and $G_2$ has
the same as $\mathbb{G}(q_0, \hat{F})$. Their joint distribution is constructed in a way that serves our current
purposes by maximizing the probability that they coincide. When the target spaces are
finite, as in our case, a coupling that satisfies this condition can be easily constructed
whenever $d_{\mathrm{tw}}(\mu(q_0, \hat{G}), \mu(q_0, \hat{F})) < 1 - \varepsilon/2$, see (2.3).

By Lemma 5.1 for 3-colored $r$-graphs (there are 3 types of entries in the graphon
representation of simple $r$-graphs: edges, non-edges, and diagonal elements) it follows
that with probability at least $1 - \varepsilon/4$ for each $\hat{F}$ there exists a $3k$-colored $\hat{U}$ with $[\hat{U}, k] = W_G$
such that $d_{\mathrm{tw}}(\mu(q_0, \hat{U}), \mu(q_0, W_{\hat{F}})) < \varepsilon/4$. Let us condition on this event and let $\hat{F}$ be fixed.
From (5.1) we know that $d_{\mathrm{tw}}(\mu(q_0, \hat{G}), \mu(q_0, W_{\hat{G}})) \le q_0^2/n \le \varepsilon/4$ and
$d_{\mathrm{tw}}(\mu(q_0, \hat{F}), \mu(q_0, W_{\hat{F}})) \le q_0^2/q \le \varepsilon/4$. Further, it follows from our condition above that there
exists a $3k$-colored $\hat{U}$ that induces a fractional coloring of $G$, and
$d_{\mathrm{tw}}(\mu(q_0, \hat{U}), \mu(q_0, W_{\hat{F}})) < \varepsilon/4$. It remains to produce a
$3k$-coloring of $W_G$ from any fixed $3k$-colored $\hat{U}$ ($k$ of the colors of $\hat{U}$ correspond exclusively
to diagonal cubes, so they can be neglected). We do this by randomization: let $(X_i)_{i \in [n]}$ be
independent random variables with $X_i$ uniformly distributed on $[\frac{i-1}{n}, \frac{i}{n}]$, and let
$(X_S)_{S \in \mathfrak{h}([n], r) \setminus \mathfrak{h}([n], 1)}$ be independent uniform random variables on $[0,1]$. We can define
$W_{\hat{G}}$ to take the color $\hat{U}(X_{\mathfrak{h}(e)})$ on the set
$[\frac{e_1 - 1}{n}, \frac{e_1}{n}] \times \dots \times [\frac{e_r - 1}{n}, \frac{e_r}{n}] \times [0,1] \times \dots \times [0,1]$ for $e = \{e_1, \dots, e_r\}$. For any
fixed H G Q'^ k basic martingale methods deliver 


P(|f(H, W G ) - f(H,U)| > 6) < 2exp(- 


5 2 n 
for any $\delta > 0$, therefore when setting $\delta$


d-tw (F(?0/U)^(?o,W g )) 



we get that 


\ Yj I^(H/W g ) — f(H,U)| < e/4 

H <* 


with probability at least 1 - e/4, since n > q > d(r,£/4,q 0r k,2) = [ ‘ /i|lnl — Init ' 2 4)] ^ 1 ( A> fsl 
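Summing up terms gives

\[ d_{\mathrm{tw}}(\mu(q_0, \hat{F}), \mu(q_0, \hat{G})) \le d_{\mathrm{tw}}(\mu(q_0, \hat{F}), \mu(q_0, W_{\hat{F}})) + d_{\mathrm{tw}}(\mu(q_0, W_{\hat{F}}), \mu(q_0, \hat{U})) + d_{\mathrm{tw}}(\mu(q_0, \hat{U}), \mu(q_0, W_{\hat{G}})) + d_{\mathrm{tw}}(\mu(q_0, W_{\hat{G}}), \mu(q_0, \hat{G})) \le \varepsilon, \]

with failure probability at most $\varepsilon/2$, which concludes our proof.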


□ 


6 Nondeterministically testable hypergraph properties 

The concept of nondeterministic testing was originally introduced for testing properties
by Lovász and Vesztergombi [14], and remarkable progress has been made in that context,
see [8] and [14]. The estimation of parameters, which is our main issue in this paper, is
closely related to that concept. For related developments in combinatorial property
testing using regularity methods we refer to [2].

We present the definition of testability of properties in the usual and in the nondeterministic
sense and construct a tester from the tester of the witness property with the aid
of Lemma 5.1 that achieves the same sample complexity as in the parameter testing case.
This result connects our contribution to previous efforts more directly and answers the
question posed in [14] asking if the equivalence of the two testability notions persists for
uniform hypergraphs of higher order similar to the case of graphs.

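Definition 6.1. An $r$-graph property $\mathcal{P}$ is testable, if there exists another $r$-graph property $\mathcal{P}'$ called
the sample property, such that

(a) $\mathbb{P}(\mathbb{G}(q, G) \in \mathcal{P}') \ge \frac{2}{3}$ for every $G \in \mathcal{P}$ and $q \ge 1$, and

(b) for every $\varepsilon > 0$ there is an integer $q_{\mathcal{P}}(\varepsilon) \ge 1$ such that for every $q \ge q_{\mathcal{P}}(\varepsilon)$ and every $G$ that is
$\varepsilon$-far from $\mathcal{P}$ we have that $\mathbb{P}(\mathbb{G}(q, G) \in \mathcal{P}') \le \frac{1}{3}$.

Testability for colored $r$-graphs is defined analogously.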
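The two conditions of the definition can be explored numerically by estimating the acceptance probability $\mathbb{P}(\mathbb{G}(q, G) \in \mathcal{P}')$ through repeated sampling. The sketch below is a toy illustration only: the property (being edgeless), its sample property, and all function names are hypothetical choices that are not taken from the paper, and hypergraphs are represented as sets of vertex tuples on `range(n)`.

```python
import random
from itertools import combinations


def sample_induced(edges, n, q, rng):
    """Edges of the subhypergraph induced by a uniform random q-subset of range(n)."""
    chosen = set(rng.sample(range(n), q))
    return {e for e in edges if set(e) <= chosen}


def acceptance_rate(edges, n, q, sample_property, trials, rng):
    """Monte Carlo estimate of the probability that a q-sample satisfies the sample property."""
    hits = sum(
        bool(sample_property(sample_induced(edges, n, q, rng)))
        for _ in range(trials)
    )
    return hits / trials


def is_edgeless(sample_edges):
    # Hypothetical sample property: the sample contains no edge at all.
    return len(sample_edges) == 0
```

For the empty 3-graph the estimate is 1, in line with condition (a), while for the complete 3-graph on 8 vertices, which is far from being edgeless, every sample of size 4 contains an edge and the estimate is 0, in line with condition (b).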

We remark that $\varepsilon$-far here means that one has to modify at least $\varepsilon |V(G)|^2$ edges in
order to obtain an element of $\mathcal{P}$. Note that $\frac{2}{3}$ and $\frac{1}{3}$ in the definition can be replaced by
arbitrary constants $1 > a > b > 0$. This change may alter the corresponding certificate $\mathcal{P}'$
and the function $q_{\mathcal{P}}$, but not the characteristic of $\mathcal{P}$ being testable or not. Let $\mathcal{P}_n$ denote the
elements of $\mathcal{P}$ of size $n$.

Next we formulate the definition of nondeterministic testability.

Definition 6.2. An $r$-graph property $\mathcal{P}$ is nondeterministically testable, if there exists an integer
$k \ge 1$ and a $2k$-colored $r$-graph property $\mathcal{Q}$ called the witness property that is testable in the sense of
Definition 6.1 satisfying $[\mathcal{Q}, k] = \{[\hat{G}, k] \mid \hat{G} \in \mathcal{Q}\} = \mathcal{P}$ (see Definition 2.2 above for the discoloring
operation).

We formulate next the main theorem in this section. 

Theorem 6.3. Every nondeterministically testable r-graph property is testable. 

Proof. Let $\mathcal{P}$ be a nondeterministically testable property with witness property $\mathcal{Q}$ of
$2k$-colored $r$-graphs. We employ the combinatorial language with counting subgraph densities
when referring to $\mathcal{Q}$ and its testability, and the probabilistic language of picking
random subgraphs in a uniform way when handling $\mathcal{P}$ in order to facilitate readability.
Let $\mathcal{Q}'$ be the corresponding sample property that certifies the testability of $\mathcal{Q}$ and $q_{\mathcal{Q}}$ be
the sample complexity corresponding to the thresholds 1/5 and 4/5, that is

(i) if $\hat{G} \in \mathcal{Q}$, then for every $q \ge 1$ we have $t(\mathcal{Q}'_q, \hat{G}) \ge 4/5$, and

(ii) for every $\varepsilon > 0$ if $\hat{G}$ is $\varepsilon$-far from $\mathcal{Q}$, then for every $q \ge q_{\mathcal{Q}}(\varepsilon)$ we have that $t(\mathcal{Q}'_q, \hat{G}) \le 1/5$.

Our task is to construct a property $\mathcal{P}'$ together with a function $q_{\mathcal{P}}$ such that they fulfill the
conditions of Definition 6.1. We are free to specify the error thresholds by the remark after
Definition 6.1; we set them to 2/5 and 3/5.

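Let $n$ be a positive integer and let $\varepsilon_n > 0$ be the infimum of all positive reals $\delta$ that satisfy
$n \ge \max\{q_{\mathrm{tw}}(r, 1/10, q_{\mathcal{Q}}(\delta), 3, k);\ 100 q_{\mathcal{Q}}^2(\delta);\ d(r, 1/10, q_{\mathcal{Q}}(\delta), k, 2)\}$ from Lemma 5.1. Define for
each $n$ the set

\[ \mathcal{P}'_n = \{ H : H \text{ is an } r\text{-graph on } n \text{ vertices and there exists a } k\text{-coloring } \hat{H} \text{ of } H \text{ such that } t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_n)}, \hat{H}) \ge 3/5 \}, \]

and let $\mathcal{P}' = \bigcup_{n=1}^{\infty} \mathcal{P}'_n$. We set
$q_{\mathcal{P}}(\varepsilon) = \max\{q_{\mathrm{tw}}(r, 1/10, q_{\mathcal{Q}}(\varepsilon), 3, k);\ 100 q_{\mathcal{Q}}^2(\varepsilon);\ d(r, 1/10, q_{\mathcal{Q}}(\varepsilon), k, 2)\}$.
We are left to check that the two conditions for testability of $\mathcal{P}$ hold with $\mathcal{P}'$ and $q_{\mathcal{P}}$ described
as above. Assume for the rest of the proof, for simplicity, that $n \ge q_{\mathcal{P}}(\varepsilon_n)$ for each $n$; the
general case follows along the same lines with some technical difficulties.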

First let $G \in \mathcal{P}$. We have to show that for every integer $q \ge 1$ we have that $\mathbb{G}(q, G) \in \mathcal{P}'_q$
with probability at least 3/5.

The condition $G \in \mathcal{P}$ implies that there exists a $k$-coloring $\hat{G}$ of $G$ such that $\hat{G} \in \mathcal{Q}$.
From the testability of $\mathcal{Q}$ it follows that $t(\mathcal{Q}'_l, \hat{G}) \ge 4/5$ for any $l \ge 1$. Let $q \ge 1$ be
arbitrary, and let $F$ denote $\mathbb{G}(q, G)$. Furthermore let $\hat{F} = \mathbb{G}(q, \hat{G})$ be generated by the same
random process as $F$, so $\hat{F}$ is a $k$-coloring of $F$. We know by a standard sampling argument
that

\[ \mathbb{P}\big( |t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_q)}, \hat{G}) - t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_q)}, \hat{F})| \ge 1/5 \big) \le 2 \exp\Big( -\frac{q}{50\, q_{\mathcal{Q}}^2(\varepsilon_q)} \Big), \qquad (6.1) \]

and the right hand side of (6.1) is less than 2/5 by the choice of $\varepsilon_q$, since by definition
$q \ge 100 q_{\mathcal{Q}}^2(\varepsilon_q)$. It follows that $t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_q)}, \hat{F}) \ge 3/5$ with probability at least 3/5, so by the
definition of $\mathcal{P}'$ we have that $F \in \mathcal{P}'_q$ with probability at least 3/5, which is what we wanted
to show.
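The numerical claim about the right hand side of (6.1) can be checked directly. The sketch below assumes that the bound reads $2\exp(-q/(50\, q_{\mathcal{Q}}^2))$, which is how the concentration bound is read here; the function name `sampling_bound` is hypothetical. At the smallest admissible sample size $q = 100\, q_{\mathcal{Q}}^2$ the bound equals $2e^{-2}$, which lies below the threshold 2/5, and it only decreases as $q$ grows.

```python
import math


def sampling_bound(q, q_witness):
    """Assumed form of the right hand side of (6.1): 2 * exp(-q / (50 * q_witness^2))."""
    return 2 * math.exp(-q / (50 * q_witness ** 2))


# Evaluate the bound at the smallest sample size allowed by the choice of epsilon_q.
for q_witness in range(1, 6):
    bound = sampling_bound(100 * q_witness ** 2, q_witness)
    print(q_witness, bound)
```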

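To verify the second condition we proceed by contradiction. Suppose that $G$ is $\varepsilon$-far
from $\mathcal{P}$, but at the same time there exists an $l \ge q_{\mathcal{P}}(\varepsilon)$ such that $F \in \mathcal{P}'_l$ with probability
larger than 2/5, where $F = \mathbb{G}(l, G)$.

In this case, the latter condition implies that with probability larger than 2/5 there
exists a $k$-coloring $\hat{F}$ of $F$ such that $t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_l)}, \hat{F}) \ge 3/5$. By Lemma 5.1 and the proof of
Theorem 2.3 there exists a $k$-coloring $\hat{G}$ of $G$ such that
$d_{\mathrm{tw}}(\mu(q_{\mathcal{Q}}(\varepsilon_l), \hat{F}), \mu(q_{\mathcal{Q}}(\varepsilon_l), \hat{G})) \le 22/100$ with probability at least 4/5, in particular

\[ |t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_l)}, \hat{F}) - t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_l)}, \hat{G})| \le \frac{22}{100}, \]

which implies that with probability at least 1/5 there exists a $\hat{G}$ such that
$t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_l)}, \hat{G}) \ge 3/5 - 22/100 = 38/100$. We can drop the probabilistic assertion and say that
there exists a $k$-coloring $\hat{G}$ of $G$ such that $t(\mathcal{Q}'_{q_{\mathcal{Q}}(\varepsilon_l)}, \hat{G}) \ge 38/100$, because $G$ and the
density expression are deterministic.

On the other hand, the fact that $G$ is $\varepsilon$-far from $\mathcal{P}$ implies that any $k$-coloring $\hat{G}$ of $G$
is $\varepsilon$-far from $\mathcal{Q}$, which means that $t(\mathcal{Q}'_q, \hat{G}) \le 1/5$ for any $k$-coloring $\hat{G}$ of $G$ and $q \ge q_{\mathcal{Q}}(\varepsilon)$.
But we know that $q_{\mathcal{Q}}(\varepsilon_l) \ge q_{\mathcal{Q}}(\varepsilon)$, since $\varepsilon_l \le \varepsilon$, which delivers the contradiction. The last
inequality is a consequence of our definitions: $\varepsilon_l$ is the infimum of the $\delta > 0$ that satisfy
$l \ge q_{\mathcal{P}}(\delta)$, and on the other hand, $l \ge q_{\mathcal{P}}(\varepsilon)$.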

□ 


7 Further research 

It would be very interesting to shed light on the explicit sample complexity bounds for the
witness parameter in Theorem 2.3. The only ingredient of our proof which is non-effective
is the part which deals with the ultralimit method in the proof of Theorem 3.15. To our
knowledge an effective proof of this result is only known for $r = 2$.